(57) Abstract

Disclosed are genetic sequences and their encoded amino acid sequences for two interior nuclear matrix proteins useful as
markers of malignant cell types. Primary and secondary structure analysis of the proteins is presented as well as means for their
recombinant production, and compositions and methods for the use of these markers in clinical assays and cancer therapies.

Novel Malignant Cell Type Markers
of the Interior Nuclear Matrix

Background of the Invention 

All eucaryotic cells, both plant and animal, have a
nucleus surrounded by the cell cytoplasm. The nucleus
contains the cellular DNA complexed with protein and
termed chromatin. The chromatin, with its associated
proteins, constitutes the major portion of the nuclear
mass and is organized by the internal protein skeleton
of the nucleus, referred to here as the nuclear matrix
(NM). The nuclear matrix also is defined as the
nuclear structure that remains following removal of the
chromatin by digestion with DNase I and extraction with
high salt. This skeletal nuclear structure further is
characterized by the "interior nuclear matrix" (INM)
and the bounding nuclear pore-lamina complex.

Diverse studies have implicated the NM in a wide
variety of nuclear functions fundamental to the control
of gene expression. (For a general review see, for
example, Fey et al. (1991) Crit. Rev. Euk. Gene
Express. 1:127-143.) In particular, as described in
U.S. Pat. Nos. 4,882,268 and 4,885,236, it is now known
that certain nuclear matrix proteins, specifically
interior nuclear matrix proteins, are useful as marker
proteins for identifying cell types. For example, the
presence and abundance of particular INM proteins have
been shown to be characteristic of specific cell types
and can be used to identify the tissue of origin of a
cell or cell fragment present in a sample. One
particularly important application of this discovery is
the use of marker INM proteins in evaluating metastatic
tissue. It is also known that the expression of
certain INM proteins is altered in malignant or
otherwise dysfunctional cells. The altered expression
pattern of these proteins in malignant and/or
dysfunctioning cells also makes the proteins and
nucleic acids encoding the proteins useful as markers,
alone or in combination, for diagnostic
purposes and for evaluating tissue viability. U.S. Pat.
Nos. 4,882,268 and 4,885,236, issued 11/21/89 and
12/5/89, respectively, to Penman and Fey, disclose a
method for selectively extracting insoluble INM
proteins and their associated nucleic acids from cells
or cellular debris and distinguishing the expression
pattern of these proteins in a particular cell type by
displaying the proteins on a two-dimensional
electrophoresis gel. In addition, it recently has been
discovered that INM proteins or protein fragments also
may be released in soluble form from dying cells. (U.S.
Application Serial No. 785,804, filed October 31,
1991.)

To date, molecular characterization of the specific
proteins of the NM, particularly the INM, remains poorly
defined due to the low abundance of these proteins in
the cell and their generally insoluble character. The
ability to isolate and characterize specific nuclear
matrix proteins and the genetic sequences encoding them
at the molecular level is anticipated to enhance the
use of these proteins and their nucleic acids as marker
molecules, and to enhance elucidation of the biological
role of these proteins in vivo.

It is an object of this invention to provide
genetic sequences encoding INM proteins useful as
markers of malignant cell types. Another object is to
provide enhanced means for identifying these proteins
and their nucleic acids, including RNA transcripts, in
samples. Yet another object of this invention is to
provide compositions for use in diagnostic and other
tissue evaluative procedures. Still another object is
to provide genetic and amino acid sequences useful as
target molecules in a cancer therapy. These and other
objects and features of the invention will be apparent
from the description, figures and claims which follow.

Summary of the Invention

Molecular characterization data, including DNA
sequence data, for two INM proteins now have been
derived from an expression library, using monoclonal
antibodies for these proteins. The proteins,
designated herein as MT1 and MT2, are present at
elevated levels in malignant tissue and extracellular
fluids. Accordingly, the proteins and the genetic
sequences encoding them are thought to be useful as
marker molecules for identifying tissue tumorigenesis in
cell or body fluid samples.

Full or partial clones of the genes encoding these
proteins now have been isolated, and the DNA sequence,
reading frames and encoded amino acid sequences of
these DNAs determined. The DNA sequence for MT2
corresponds to the sequence disclosed by Yang et al.
(1992) J. Cell Biol. 116:1303-1317, and Compton et al.
(1992) J. Cell Biol. 116:1395-1408, referred to therein
as NuMA. The nucleic acid (and the encoded amino acid
sequence) described herein for MT1 has not been
described previously and also constitutes a novel
sequence sharing little sequence homology with those
sequences known in the art. In addition, MT1 has been
subcloned into an expression vector, and the protein
expressed as a cleavable fusion protein in E. coli.
Both the MT1 and MT2 (NuMA) proteins are distributed
throughout the nucleus (with the exception of the
nucleolus) in non-mitotic cells, and localize to the
spindle during mitosis, as determined by
immunofluorescence.

The genetic sequences described herein provide a
family of proteins for each of the proteins MT1 and
MT2, including allelic and species variants of MT1 and
MT2. The family of proteins includes these proteins
produced by expression in a host cell from recombinant
DNA, the DNA itself, and the host cells harboring and
capable of expressing these nucleic acids. The
recombinantly produced proteins may be isolated using
standard methodologies such as affinity chromatography
to yield substantially pure proteins. As used herein,
"substantially pure" is understood to mean
substantially free of undesired, contaminating
proteinaceous material.

The family of proteins defined by MT1 includes
proteins encoded by the nucleic acid sequence of Seq.
ID No. 1, including analogs thereof. As used herein,
"analog" is understood to include allelic and species
variants, and other naturally-occurring and engineered
mutants. These variants include both biologically
active and inactive forms of the protein. Particularly
envisioned are DNAs having a different preferred codon
usage, those having "silent mutations" of the DNA of
Seq. ID No. 1, wherein the changes in the genetic
sequence do not affect the encoded amino acid sequence,
and DNAs encoding "conservative" amino acid changes, as
defined by Dayhoff et al., Atlas of Protein Sequence and
Structure, vol. 5, Supp. 3, pp. 345-362 (M.O. Dayhoff,
ed., Nat'l Biomed. Research Foundation, Washington,
D.C. 1979).
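
The distinction drawn above between a silent mutation and a conservative amino acid change can be illustrated with a minimal Python sketch. The sequences are hypothetical and are not taken from Seq. ID No. 1; the codon table is only an excerpt of the standard genetic code covering the codons used here, and the function names are illustrative assumptions.

```python
# Excerpt of the standard genetic code; only the codons used below are listed.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "GCC": "A", "CTG": "L", "CTC": "L",
    "AAA": "K", "AAG": "K", "GAA": "E", "GAT": "D",
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def classify_change(original, variant):
    """Return 'silent' if both DNAs encode the same amino acid sequence."""
    return "silent" if translate(original) == translate(variant) else "non-silent"


# Hypothetical sequences for illustration only.
original = "ATGGCTCTGAAAGAA"  # encodes M-A-L-K-E
silent = "ATGGCCCTCAAGGAA"    # different codon usage, same protein
changed = "ATGGCTCTGAAAGAT"   # last codon GAA -> GAT, Glu -> Asp

print(classify_change(original, silent))
print(classify_change(original, changed))
```

The third sequence changes glutamate to aspartate, a substitution generally regarded as conservative; deciding which substitutions qualify would require a substitution table such as the one cited above, which this sketch does not include.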

Accordingly, the nucleic acids encoding the protein
family of MT1 may be defined as those sequences which
hybridize to the DNA sequence of Seq. ID No. 1 under
stringent hybridization conditions. As used herein,
stringent hybridization conditions are as defined in
Molecular Cloning: A Laboratory Manual, Maniatis,
et al. eds., Cold Spring Harbor Press, 1985, e.g.:
hybridization in 50% formamide, 5 x Denhardt's Solution,
5 x SSPE, 0.1% SDS and 100 µg/ml denatured salmon
sperm DNA, and washing in 2 x SSC, 0.1% SDS, at 37°C, and
1 X SSC, 0.1% SDS at SS^'C. 

The family of proteins defined by MT2 includes
proteins encoded by the nucleic acid sequence of Seq.
ID No. 3, including analogs thereof, including allelic
and species variants, and other naturally-occurring and
engineered mutants. These variants include both
biologically active and inactive forms of the protein.
Particularly envisioned are DNAs having silent
mutations, other preferred codon usages, and DNAs
encoding conservative amino acid changes. The nucleic
acids encoding the protein family of MT2 of this
invention may be defined as those sequences which
hybridize with the DNA sequence of Seq. ID No. 3 under
stringent hybridization conditions.

In another aspect, the invention provides nucleic
acid fragments ("oligonucleotides" or "oligomers")
which hybridize to genetic sequences encoding MT1, but
which do not necessarily encode functional proteins
themselves. The oligonucleotides include probes for
isolating genetic sequences encoding members of the MT1
family of proteins from a cDNA or genomic DNA library,
and/or for identifying genetic sequences naturally
associated with the MT1 protein coding sequence, e.g.,
sequences lying upstream or downstream from the coding
sequences. For example, where the nucleic acid
fragment is to be used as a probe to identify other
members of the MT1 family, the nucleic acid fragment
may be a degenerate sequence as described in Molecular
Cloning: A Laboratory Manual, Maniatis, et al. eds.,
Cold Spring Harbor Press, 1985, designed using the
sequence of Seq. ID No. 1 as a template. Accordingly,
the oligonucleotide or nucleic acid fragment may
comprise part or all of the DNA sequence of Seq. ID
No. 1, or may be a biosynthetic sequence based on the
DNA sequence of Seq. ID No. 1. The oligonucleotide
preferably is suitably labelled using conventional
labelling techniques.
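
When a degenerate probe of the kind described above is designed from an amino acid sequence, one practical consideration is how many distinct nucleotide sequences could encode the chosen peptide stretch. The following is a minimal sketch of that count, assuming the standard genetic code; the peptide used is hypothetical and is not taken from Seq. ID No. 1, and the function names are illustrative.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide):
    """Count the distinct DNA sequences that could encode the peptide."""
    count = 1
    for residue in peptide:
        count *= CODONS_PER_RESIDUE[residue]
    return count


def least_degenerate_window(protein, width):
    """Return (start index, peptide) of the window with the lowest degeneracy."""
    starts = range(len(protein) - width + 1)
    best = min(starts, key=lambda i: probe_degeneracy(protein[i:i + width]))
    return best, protein[best:best + width]


# Hypothetical peptide for illustration only.
print(probe_degeneracy("LSR"))
print(least_degenerate_window("LSRMWKDE", 3))
```

Stretches rich in methionine and tryptophan give the least degenerate probes, because each of these residues is encoded by a single codon.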

The oligonucleotides also include sequences which
hybridize with the mRNA transcript encoding the MT1
protein. These complementary sequences are referred to
in the art and herein as antisense sequences.
Antisense sequences may comprise part or all of the
sequence of Seq. ID No. 1, or they may be biosynthetic
sequences designed using the sequence of Seq. ID No. 1
as a template.
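
As an illustration of how an antisense sequence relates to its template, the sketch below derives the reverse complement of a stretch of a coding (sense) strand. The input sequence is hypothetical and is not taken from Seq. ID No. 1, and the function names are illustrative assumptions; an RNA antisense molecule would carry U in place of T.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(sense_strand, start, length):
    """Return a DNA oligo complementary to sense_strand[start:start + length]."""
    target = sense_strand[start:start + length]
    if len(target) != length:
        raise ValueError("requested window extends past the end of the sequence")
    return reverse_complement(target)


# Hypothetical sense-strand fragment for illustration only.
sense = "ATGGCCCTCAAG"
print(antisense_oligo(sense, 3, 6))
```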

In still another aspect, the invention provides
oligonucleotides which hybridize to the genetic
sequences encoding members of the MT2 protein family.
The fragments include antisense sequences and sequences
useful as probes for identifying members of the MT2
family and/or for identifying associated noncoding
sequences. The hybridizing nucleic acids may comprise
part or all of the sequence of Seq. ID No. 3 or may be
biosynthetic sequences designed using the DNA sequence
of Seq. ID No. 3 as a template, preferably suitably
labelled using conventional techniques.

The genetic sequences identified herein encode
proteins identified as marker proteins indicative of a
malignancy or other cellular dysfunction in a tissue.
Thus, in another aspect, the invention provides
compositions for obtaining antibodies useful for
detecting cancer marker proteins in a sample using the
proteins described herein in combination with a
suitable adjuvant. In another aspect, the invention
provides genetic templates for designing sequences
which hybridize specifically with the mRNA transcripts
encoding these proteins. In still another aspect, the
invention provides isolated DNA sequences for use in
expressing proteins and protein fragments for the
design of binding proteins, including antibodies, which
interact specifically with an epitope on MT1 or MT2.
The invention also provides methods for evaluating the
status of a tissue using the genetic sequences
described herein, and the marker proteins encoded by
them. Finally, the invention provides methods for
treating a malignancy in an individual using these
marker proteins, or the genetic sequences encoding
them, as target molecules to inhibit or disable the
cell's ability to undergo cell division.

Brief Descriptions of the Drawings

Fig. 1A-1D is a schematic representation of the
amino acid sequence of MT1 of Seq. ID No. 1, showing:

Fig. 1A: the location of the proline residues;
Fig. 1B: the areas defined as α-helices within the
sequence;
Fig. 1C: the location of the cysteine residues;
and
Fig. 1D: the sites of cleavage by NTCB;

Fig. 2A-2B is a schematic representation of the
amino acid sequence of MT2 of Seq. ID No. 3, showing:

Fig. 2A: the location of proline residues; and
Fig. 2B: the areas defined as α-helices within the
sequence;

Fig. 3: lists the levels of body fluid-soluble
MT2 and MT2-associated protein quantitated in various
normal and malignant tissue sample supernatants; and

Fig. 4: lists the levels of body fluid-soluble
MT2 and MT2-associated protein quantitated in sera
isolated from cancer patients and normal blood donors.

Detailed Description

In an attempt to characterize INM proteins useful
as malignant cell markers in biological assays, the
genetic sequences encoding two INM proteins, herein
referred to as MT1 and MT2, now have been identified
and characterized. DNA sequences encoding these
proteins now have been cloned by probing expression
libraries using monoclonal antibodies raised against
the isolated INM proteins MT1 and MT2. The proteins
were isolated from malignant cells essentially
following the method of Penman and Fey, described in
U.S. Pat. Nos. 4,882,268 and 4,885,236, the disclosures
of which are herein incorporated by reference. The
cloned DNAs then were sequenced and their reading
frames identified and analyzed. The genetic sequence
encoding MT2 also has been disclosed by others (Yang
et al. (1992) J. Cell Biol. 116:1303-1317 and
Compton et al. (1992) J. Cell Biol. 116:1395-1408),
and is referred to by them as "NuMA". Comparison of
MT1 and MT2 (NuMA) with other sequences in the art
indicates that the sequences encoding these proteins
share little homology with
previously described sequences.

MT1 also has been expressed as a cleavable fusion
protein in E. coli and compared with the protein
isolated from mammalian cells. Anti-MT1 antibodies
raised against the natural-sourced MT1 protein also
crossreact with the recombinantly produced protein.
Both the natural-sourced and recombinantly produced
proteins have the same apparent molecular weight when
analyzed by SDS-PAGE (90 kD) and equivalent pI values
(5.4), and both proteins show the same cleavage pattern
when cleaved with 2-nitro-5-thiocyanobenzoic acid
(NTCB, see infra).
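
To illustrate how a cleavage pattern can be predicted from a sequence and compared between two protein preparations, the sketch below splits a protein at its cysteine residues. It rests on two assumptions not stated in this passage: that NTCB cleaves the peptide bond on the amino-terminal side of each cysteine, and that cleavage goes to completion. The sequence is hypothetical and is not the MT1 sequence.

```python
def ntcb_fragments(protein):
    """Split a protein on the N-terminal side of every cysteine (assumed rule)."""
    # A cysteine at position 0 has no bond on its N-terminal side to cleave.
    cut_sites = [i for i, residue in enumerate(protein) if residue == "C" and i > 0]
    bounds = [0] + cut_sites + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


def fragment_lengths(protein):
    """Return fragment lengths, a rough proxy for a gel banding pattern."""
    return sorted(len(fragment) for fragment in ntcb_fragments(protein))


# Hypothetical sequence for illustration only.
print(ntcb_fragments("MAKCLLECGG"))
print(fragment_lengths("MAKCLLECGG"))
```

Two preparations of the same polypeptide would be expected to yield the same list of fragment lengths, which is the kind of agreement the comparison above relies on.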

Immunolocalization data on MT1 indicate that MT1
protein is distributed within the INM in non-mitotic
cells as discrete punctate foci, nonuniformly
distributed throughout the nucleoplasm of the INM.
Specifically, the foci are present in the
interchromatinic regions of the nucleus and are
distributed in a stable association that remains after
chromatin extraction, as is anticipated for an interior
nuclear matrix protein. In addition, MT1 foci are
excluded from the nucleolus and the nuclear lamina.
Moreover, during mitosis, the distribution of MT1
changes and MT1 becomes aligned in a stellate or
star-shaped pattern at the spindle of the dividing cell.
The protein does not co-localize with the chromosomes,
suggesting that MT1 may play a structural role during
mitosis. The immunolocalization data are consistent
with the MT1 amino acid sequence analysis data, which
fail to find structural homology with any known DNA
binding motifs, such as the "leucine zipper."

While the MT2 (NuMA) protein has not yet been
recombinantly expressed, the predicted molecular weight
of 238 kDa for this protein, calculated from the
predicted amino acid sequence (see Seq. ID No. 3),
agrees with that of the natural-sourced material.
Immunolocalization studies on MT2 (NuMA) indicate
that it also forms punctate foci located throughout the
nucleoplasm of the non-mitotic cell, and also is
excluded from the nucleolus. During mitosis the
protein appears to migrate to the spindle poles of the
dividing cell. The primary sequence appears to suggest
a coiled-coil motif for the folded protein (Compton et
al. (1992) J. Cell Biol. 116:1395-1408; Yang et al.
(1992) J. Cell Biol. 116:1303-1317).

I. How to Use

The nucleic acids disclosed herein encode proteins
originally identified as marker proteins useful for
identifying cell malignancies or other cell
abnormalities. Specifically, significantly elevated
levels of these proteins are detected in malignant
cells and in extracellular fluids, e.g., sera, of
cancer patients. (See PCT publication WO93/09437 and
infra.) For example, the presence and/or abundance of
these proteins or their transcripts in a sample
containing cells or cell nuclear debris may be used to
determine whether a given tissue comprises malignant
cells or cells having other abnormalities, such as
chromosomal abnormalities. The sample may be an
exfoliated cell sample or a body fluid sample, e.g., a
sample comprising blood, serum, plasma, urine, semen,
vaginal secretions, spinal fluid, saliva, ascitic
fluid, peritoneal fluid, sputum, tissue swabs, and body
exudates such as breast exudate.

In addition, because INM proteins are released in
soluble form from dying cells, the marker molecules may
be used to evaluate the viability of a given tissue.
For example, the marker proteins may be used to
evaluate the status of a disease or the efficacy of a
therapeutic treatment or procedure, by monitoring the
release of these marker molecules into a body fluid
over a period of time. Particularly useful body fluids
include blood, serum, plasma, urine, semen, vaginal
secretions, spinal fluid, saliva, ascitic fluid,
peritoneal fluid, sputum, tissue swabs, and body
exudates such as breast exudate. Methods for
performing these assays are disclosed in U.S. Pat.
Nos. 4,882,268 and 4,885,236 and in co-pending U.S.
application Serial No. 214,022, filed June 30, 1988,
and U.S. application Serial No. 785,804, filed October
31, 1991, the disclosures of which all are herein
incorporated by reference.

All of these assays are characterized by the
following general procedural steps:

1) detecting the presence and/or abundance of
the marker protein or its transcript in "authentic" or
reference samples;

2) detecting the presence and/or abundance of
the marker protein or its transcript in the sample of
interest; and

3) comparing the quantity of marker protein or
its transcript in the sample of interest with the
quantity present in the reference sample.

Where the assay is used to monitor tissue
viability, the step of detecting the presence and
abundance of the marker protein or its transcript in
samples of interest is repeated at intervals and these
values then are compared, the changes in the detected
concentrations reflecting changes in the status of the
tissue. Where the assay is used to evaluate the
efficacy of a therapy, the monitoring steps occur
following administration of the therapeutic agent or
procedure (e.g., following administration of a
chemotherapeutic agent or following radiation
treatment).
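
The comparison and monitoring steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the marker levels are hypothetical numbers in arbitrary units, the two-fold cutoff is an arbitrary placeholder rather than a value taken from this disclosure, and the function names are illustrative.

```python
def fold_over_reference(sample_level, reference_level):
    """Ratio of the marker level in the sample to that in the reference."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return sample_level / reference_level


def is_elevated(sample_level, reference_level, cutoff=2.0):
    """Flag a sample whose marker level exceeds the reference by the cutoff."""
    return fold_over_reference(sample_level, reference_level) >= cutoff


def changes_over_time(levels):
    """Differences between consecutive measurements taken at intervals."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


# Hypothetical values: one sample against a reference, then a time series
# measured after a treatment.
print(is_elevated(30.0, 10.0))
print(changes_over_time([40.0, 25.0, 10.0]))
```

In practice the cutoff would be set from the distribution of values in reference samples for the particular assay and sample type.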

It is not required that the selected marker protein
or transcript be totally unique, in the sense that the
particular INM marker molecule is present in the target
cell type and in no other. Rather, it is required that
the marker molecule have a signal to noise ratio high
enough to discriminate the preselected cell type in
samples for which the assay is designed. For example,
MT1 and MT2 proteins are useful as proteins indicating
the presence of malignancy in cell samples because of
their elevated expression levels in malignant cells,
even though the proteins, or close analogs thereof, may
be present commonly in nonmalignant cell types.

A brief description of general protein and nucleic
acid assay considerations follows below. Details of
particular assay conditions may be found in the assay
references described above and incorporated herein by
reference, and in published protocols well known in the
art and readily available.

A. Protein Assays 

Characterization of the MT1 and MT2 proteins at the
molecular level as described herein allows one to
characterize the proteins structurally and
biochemically. Accordingly, following the disclosure
of these genetic sequences and their encoded amino acid
sequences, preferred binding epitopes may be identified
which may be used to enhance assay conditions. For
example, binding proteins may be designed which have
enhanced affinity for the marker protein produced by
particular cell types or as a function of particular
malignancies. Similarly, binding proteins may be
designed which bind preferentially to protein fragments
released from dying cells. In addition, structural
and/or sequence variations between proteins produced in
normal and abnormal tissue now may be investigated and
used to advantage. The genetic sequences may be
manipulated as desired, e.g., truncated, mutagenized or
the like, using standard recombinant DNA procedures
known in the art, to obtain proteins having desired
features useful for antibody production.

As will be appreciated by those skilled in the art,
any means for specifically identifying and quantifying
a marker protein of interest is contemplated. The
currently preferred means for detecting a protein of
interest in a sample is by means of a binding protein
capable of interacting specifically with the marker
protein. Labelled antibodies or the binding portions
thereof in particular may be used to advantage. The
antibodies may be monoclonal or polyclonal in origin,
or may be biosynthetically produced. The amount of
complexed marker protein, e.g., the amount of marker
protein associated with the binding protein, then is
determined using standard protein detection
methodologies well described in the art.

A.1. Immunoassays

A variety of different forms of immunoassays
currently exist, all of which may be adapted to detect
and quantitate INM proteins and protein fragments. For
exfoliated cell samples, as an example, the cells and
surrounding fluid are collected and the INM proteins
selectively isolated by the method of Penman and Fey,
described in U.S. Pat. Nos. 4,882,268 and 4,885,236.
These proteins then preferably are separated by
two-dimensional gel electrophoresis and the presence of the
marker protein detected by standard Western blot
procedures.

For serum and other fluid assays where the marker
proteins and/or protein fragments to be detected exist
primarily in solution, one of the currently most
sensitive immunoassay formats is the sandwich
technique. In this method, which is described in PCT
publication WO93/09437 and which has a precision
typically of ± 5%, two antibodies capable of binding
the analyte of interest generally are used: e.g., one
immobilized onto a solid support, and one free in
solution, but labeled with some easily detectable
chemical compound. Examples of chemical labels that
may be used for the second antibody include
radioisotopes, fluorescent compounds, and enzymes or
other molecules which generate colored or
electrochemically active products when exposed to a
reactant or enzyme substrate. When samples containing
the marker protein or protein fragment are placed in
this system, the marker protein binds to both the
immobilized antibody and the labelled antibody. The
result is a "sandwich" immune complex on the support's
surface. The complexed protein is detected by washing
away nonbound sample components and excess labeled
antibody, and measuring the amount of labeled antibody
complexed to protein on the support's surface. The
sandwich immunoassay is highly specific and very
sensitive, provided that labels with good limits of
detection are used. A detailed review of
immunological assay design, theory and protocols can be
found in numerous texts in the art, including Practical
Immunology, Butt, W.R., ed., Marcel Dekker, New York,
1984.

In general, immunoassay design considerations 
include preparation of antibodies (e.g., monoclonal or 
polyclonal) having sufficiently high binding 
specificity for their antigen that the specifically- 
bound antibody-antigen complex can be distinguished 
reliably from nonspecific interactions. As used 
herein, "antibody" is understood to include other 
binding proteins having appropriate binding affinity 
and specificity for the marker protein. The higher the 
antibody binding specificity, the lower the antigen 
concentration that can be detected. Currently 
preferred binding specificity is such that the binding 
protein has a binding affinity for the marker protein 
of greater than about 10^ M" S preferably greater than 
about 10^ M" ^ . 

Antibody binding domains also may be produced
biosynthetically and the amino acid sequence of the
binding domain manipulated to enhance binding affinity
with a preferred epitope. Identification of the
genetic sequences for MT1 and MT2 can be used to
advantage in the design and construction of preferred
binding proteins. For example, a DNA encoding a
preferred epitope may be recombinantly expressed and
used to select an antibody which binds selectively to
the epitope. The selected antibodies then are exposed
to the sample under conditions sufficient to allow
specific binding of the antibody to its specific
nuclear matrix protein or protein fragment, and the
amount of complex formed then detected. Specific
antibody methodologies are well understood and
described in the literature. A more detailed
description of their preparation can be found, for
example, in Practical Immunology, Butt, W.R., ed.,
Marcel Dekker, New York, 1984.

The choice of tagging label also will depend on the
detection limitations desired. Enzyme assays (ELISAs)
typically allow detection of a colored product formed
by interaction of the enzyme-tagged complex with an
enzyme substrate. Alternative labels include
radioactive or fluorescent labels. The most sensitive
label known to date is a chemiluminescent tag where
interaction with a reactant results in the production
of light. Useful labels include chemiluminescent
molecules such as acridinium esters or chemiluminescent
enzymes where the reactant is an enzyme substrate.
When, for example, acridinium esters are reacted with an
alkaline peroxide solution, an intense flash of light
is emitted, allowing the limit of detection to be
increased 100 to 10,000 times over those provided by
other labels. In addition, the reaction is rapid. A
detailed review of chemiluminescence and immunoassays
can be found in Weeks et al. (1983) Methods in
Enzymology 133:366-387. Other considerations for fluid
assays include the use of microtiter wells or column
immunoassays. Column assays may be particularly
advantageous where rapidly reacting labels, such as
chemiluminescent labels, are used. The tagged complex
can be eluted to a post-column detector which also
contains the reactant or enzyme substrate, allowing the
subsequent product formed to be detected immediately.

A.2. Antibody Production

The proteins described herein may be used to raise
antibodies using standard immunological procedures well
known and described in the art. See, for example,
Practical Immunology, Butt, W.R., ed., Marcel Dekker,
NY, 1984. Briefly, an isolated INM protein produced,
for example, by recombinant DNA expression in a host
cell, is used to raise antibodies in a xenogenic host.
Preferred antibodies are antibodies that bind
specifically to an epitope on the protein, preferably
having a binding affinity greater than lO^M'S most 

20 preferably having an affinity greater than lO^M"^ for 
that epitope. For example, where antibodies to a human
INM protein, e.g., MT1 or MT2, are desired, a suitable
antibody-generating host is a mouse, goat, rabbit,
guinea pig, or other mammal useful for generating
antibodies. The protein is combined with a suitable
adjuvant capable of enhancing antibody production in
the host, and injected into the host, for example, by
intraperitoneal administration. Any adjuvant suitable
for stimulating the host's immune response may be used
to advantage. A currently preferred adjuvant is
Freund's complete adjuvant (an emulsion comprising
killed and dried microbial cells, e.g., from Calbiochem
Corp., San Diego, or Gibco, Grand Island, NY). Where
multiple antigen injections are desired, the subsequent
injections comprise the antigen in combination with an
incomplete adjuvant (e.g., cell-free emulsion).

Polyclonal antibodies may be isolated from the
antibody-producing host by extracting serum containing
antibodies to the protein of interest. Monoclonal
antibodies may be produced by isolating host cells that
produce the desired antibody, fusing these cells with
myeloma cells using standard procedures known in the
immunology art, and screening for hybrid cells
(hybridomas) that react specifically with the INM
protein and have the desired binding affinity.

Provided below is an exemplary protocol for
monoclonal antibody production, which is currently
preferred. Other protocols also are envisioned.
Accordingly, the particular method of producing
antibodies with the cancer marker protein compositions
of this invention is not envisioned to be an aspect of
the invention. Also described below are exemplary
sandwich immunoassays and dot blot assays useful for
detecting and/or quantitating marker proteins in a
sample. Other means for detecting marker proteins,
particularly MT1, MT2 and their analogs, including
protein fragments and naturally-occurring variants,
also are envisioned. These other methods are well-known
and described in the art.

Exemplary antibody production protocol: Balb/c ByJ
mice (Jackson Laboratory, Bar Harbor, ME) are
injected intraperitoneally with purified INM protein
(e.g., MT1) purified from the human cervical cell line
CaSki, every 2 weeks for a total of 16 weeks. The mice
are injected with a single boost 4 days prior to
sacrifice and removal of the spleen. Freund's complete
adjuvant (Gibco, Grand Island) is used in the first
injection, incomplete Freund's in the second injection;
subsequent injections are made with saline. Spleen
cells (or lymph node cells) then are fused with a mouse
myeloma line, e.g., using the method of Kohler and
Milstein (1975) Nature 256:495, the disclosure of which
is incorporated herein by reference, and using
polyethylene glycol (PEG, Boehringer Mannheim,
Germany). Hybridomas producing antibodies that react
with nuclear matrix proteins then are cloned and grown
as ascites. Hybridomas are screened by nuclear
reactivity against the cell line that is the source of
the immunogen, and by tissue immunochemistry using
standard procedures known in the immunology art.
Detailed descriptions of screening protocols, ascites
production and immunoassays also are disclosed in PCT
publication WO93/09437.

Exemplary Assays:

A. Sandwich Immunoassay (ELISA)

A standard immunoassay can be performed to generate
dose response curves for antigen binding, for cross
reactivity assays, and for monitoring assays. The data
is generated with a standard preparation of NM antigen,
and is used as the reference standard when body fluids
are assayed. In these examples both ELISAs and
radioimmunoassays were performed.

1. Immunoassay (Well Assay)

Microtitre plates (Immulon II, Dynatech, Chantilly,
VA) are coated with purified antibody at 5 to 15 µg/ml
in PBS at pH 7.4 for 1 hr or overnight and then washed
3 x with 300 µl of PBS. The plates then are blocked with
10% normal goat serum in PBS for 1 hr at room temp and
washed 3 x with 300 µl of PBS. An exemplary protocol
follows.

Here, samples are assayed by pipetting 100 µl of
sample per well, and incubating for 1 hr at RT. The
wells are washed with 3 x 300 µl of PBS. 100 µl of 1.25
to 10 µg/ml of a biotinylated antibody is added to each
well, incubated for 1 hr at RT and washed with 3 x 300 µl
of PBS. 100 µl of a 1:1000 dilution of streptavidin-
horseradish peroxidase conjugate (The Binding Site
Ltd., Birmingham, UK) is added to each well, incubated
for 1 hr and then washed with PBS. 100 µl of peroxidase
substrate (citrate, phosphate, OPD-H2O2) then is added
to each well and incubated for 20 min. The reaction is
stopped by adding 50 µl of 1 M H2SO4 to the wells. The
optical density is read on a plate reader at 490 nm.

Concentrations of antigen are determined by
preparing a reference concentration of antigen and
preparing a standard dilution curve to compare with the
unknown samples.

2. IRMA (Immunoradiometric Assay)

(a) Iodination of Streptavidin.

10 µg of streptavidin (Sigma, Inc., Cincinnati) in
2 µl of 0.05 M phosphate pH 7.4 is added to 10 µl of
0.25 M phosphate pH 7.4 in a microcentrifuge tube and
1 mCi of 125I (NEN-DUPONT, Wilmington, DE) in 10 µl is
added. Immediately 10 µl of 100 mg chloramine-T
trihydrate (Sigma, Inc.) in 50 ml of distilled water is
added, mixed, and reacted for 25 sec. The reaction then
is stopped by mixing for 20 sec with 50 µl of 40 mg
cysteamine (2-mercaptoethylamine) (Sigma, Inc.) and 5 mg
KI in 50 ml of 0.05 M phosphate pH 7.4. 0.5 ml of 1% BSA
in PBS pH 7.4 is added and the material fractionated on
a 10 ml Sephadex G-100 column (Pharmacia, Sweden) pre-
equilibrated with the BSA/PBS buffer. Thirty 0.5 ml
fractions are collected and 10 µl of each fraction is
diluted to 1 ml with the BSA/PBS buffer. 100 µl of the
diluted fraction is counted on a LKB gamma counter set
for 125I. The specific activity is calculated and
routinely falls between 85 and 100 µCi/µg. The mid
fractions of the protein peak then are used in the
sandwich immunoassay.

(b) Sandwich Radioimmunoassay.

The microtitre breakaway wells (Immulon II
Removawell strips, Dynatech, Chantilly, VA) are coated
and blocked as in the ELISA assay. The samples,
standard or sera, are routinely measured by incubating
100 µl in the wells for 1 hr at RT, washing on a plate
washer with 3 x 300 µl of PBS, then incubating with
the biotinylated antibody (2-10 µg/ml in 10% goat serum)
for 1 hr at RT and washing again. The bound biotinylated
antibody is detected with the 125I-streptavidin.
200,000 to 300,000 cpm (77% counter efficiency) in
100 µl is added to each well, incubated for 1 hr at RT
and washed again. The bound fraction is detected by
counting the radioactivity in an LKB gamma counter.
The concentration can be determined by comparing the
counts obtained against a reference preparation.
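
As a minimal sketch of how readings from either assay might be converted to a concentration, the following Python interpolates an unknown's signal (optical density at 490 nm, or bound cpm) on a standard dilution curve. The log-log interpolation between bracketing standards, the function name and the example standards are illustrative assumptions, not values or methods taken from this disclosure.

```python
import math

def interpolate_concentration(standards, signal):
    """Estimate concentration from a signal using a standard dilution curve.

    standards: list of (concentration, signal) pairs measured on the
    reference preparation; all values must be > 0 and the signal is
    assumed to rise with concentration. Interpolation is linear in
    log-log space between the two bracketing standards (an assumed model).
    """
    pts = sorted(standards, key=lambda p: p[1])  # order by signal
    if not pts[0][1] <= signal <= pts[-1][1]:
        raise ValueError("signal outside the range of the standard curve")
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pts, pts[1:]):
        if s_lo <= signal <= s_hi:
            if s_hi == s_lo:
                return c_lo
            frac = (math.log(signal) - math.log(s_lo)) / (
                math.log(s_hi) - math.log(s_lo))
            return math.exp(
                math.log(c_lo) + frac * (math.log(c_hi) - math.log(c_lo)))

# Hypothetical standards: (antigen concentration, OD at 490 nm)
standards = [(1.0, 0.1), (10.0, 0.4), (100.0, 1.6)]
print(interpolate_concentration(standards, 0.2))
```

Readings outside the range of the standards raise an error rather than being extrapolated, which mirrors the usual practice of diluting and re-assaying such samples.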

B. Dot Blot Detection of NM.

Antibody reactivity with NM proteins can be
assessed by dot blot detection assays, using standard
methodologies and apparatus (e.g., Schleicher &
Schuell). Nitrocellulose membranes are soaked in Tris
buffered saline (TBS, 50 mM Tris, 150 mM NaCl, pH 7.6)
and NM preparation applied at varying concentrations of
protein to a series of wells and incubated for 1 hr at
room temperature (e.g., T-47D NM supernatant at
10 µg/ml, 1 µg/ml and 100 ng/ml). The wells then
are washed with 2 x 200 µl of TBS and then blocked with
100 µl of 10% normal goat serum in TBS for 1 hr at room
temperature. The blocked wells then are washed again
with 2 x 200 µl of TBS and 100 µl of culture supernatant
containing nuclear reactive antibody to be tested is
added to their respective wells and incubated for 1 hr
at room temperature. The wells then are washed with 2
x 200 µl of TBS and 100 µl of a dilution series of
alkaline phosphatase conjugated goat anti-mouse IgG
(Bio-Rad, Richmond, CA) (e.g., 1:1000, 1:5000, or
1:10000) is added to the relevant wells and incubated for
1 hr. The wells then are washed with 2 x 200 µl of TBS
followed by addition of enzyme substrate (BCIP/NBT,
Kirkegaard and Perry, Gaithersburg, MD) in
Tris buffer containing Levamisole (Vector, Inc., Corpus
Christi, TX). A fifteen minute incubation generally is
sufficient. The reaction can be stopped by washing
with distilled water and the product detected.

B. Nucleic Acid Assays

The status of a tissue also may be determined by
detecting the quantity of transcripts encoding these
cancer marker proteins. The currently preferred means
for detecting mRNA is northern blot analysis using
labeled oligonucleotides, e.g., nucleic acid fragments
capable of hybridizing specifically with the transcript
of interest. The currently preferred oligonucleotide
sequence is a sequence complementary to at least part
of the transcript marker sequence. These complementary
sequences are known in the art as "antisense"
sequences. The oligonucleotides may be
oligoribonucleotides or oligodeoxyribonucleotides. In
addition, oligonucleotides may be natural oligomers
composed of the biologically significant nucleotides,
i.e., A (adenine), dA (deoxyadenine), G (guanine), dG
(deoxyguanine), C (cytosine), dC (deoxycytosine), T
(thymine) and U (uracil), or modified oligonucleotide
species, substituting, for example, a methyl group or a
sulfur atom for a phosphate oxygen in the inter-
nucleotide phosphodiester linkage. (See, for example,
Section I.C, below.) Additionally, the nucleotides
themselves, and/or the ribose moieties may be modified.

The sequences may be synthesized chemically, using
any of the known chemical oligonucleotide synthesis
methods well described in the art. For example, the
oligonucleotides are advantageously prepared by using
any of the commercially available, automated nucleic
acid synthesizers. Alternatively, the oligonucleotides
may be created by standard recombinant DNA techniques,
by, for example, inducing transcription of the
noncoding strand. For example, the DNA sequence
encoding a marker protein may be inverted in a
recombinant DNA system, e.g., inserted in reverse
orientation downstream of a suitable promoter, such
that the noncoding strand now is transcribed.
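
Because an antisense sequence is the reverse complement of the coding (sense) strand, a candidate can be written out computationally before synthesis. The sketch below assumes the sense strand is supplied as a plain DNA string; the function names and the short example fragment are hypothetical and are not taken from the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antisense_dna(sense):
    """Return the antisense (reverse-complement) DNA for a sense-strand string."""
    bad = set(sense) - set("ACGTacgt")
    if bad:
        raise ValueError(f"unexpected characters: {sorted(bad)}")
    return sense.translate(COMPLEMENT)[::-1]

def as_rna(dna):
    """Write a DNA oligonucleotide as the matching oligoribonucleotide (T -> U)."""
    return dna.replace("T", "U").replace("t", "u")

example_sense = "ATGGCC"  # made-up fragment beginning with an initiation codon
print(antisense_dna(example_sense))
print(as_rna(antisense_dna(example_sense)))
```

Applying `antisense_dna` twice returns the original string, which is a quick sanity check on any sequence prepared this way.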

Useful hybridizing oligonucleotide sequences
include any sequences capable of hybridizing
specifically to the MT1 or MT2 primary transcripts.
Accordingly, as will be appreciated by those skilled in
the art, useful sequences contemplated include both
sequences complementary to the DNA sequences provided
in Seq. ID No. 1 (MT1) or Seq. ID No. 3 (MT2) which
correspond to the protein coding regions, as well as
sequences complementary to transcript sequences
occurring further upstream or downstream from the
coding sequence (e.g., sequences contained in, or
extending into, the 5' and 3' untranslated regions).
Representative antisense sequences are described in
Seq. ID Nos. 5 and 6. Seq. ID No. 5 describes a
sequence complementary to the first 100 nucleotides of
the MT1 protein coding sequence (compare Seq. ID
Nos. 1 and 5) as well as the 53 nucleotide sequence
occurring upstream of the initiation codon. The
complementary nucleotides to the initiation codon occur
at positions 298-300 in Seq. ID No. 5. Similarly, Seq.
ID No. 6 describes a sequence complementary to the
first 100 nucleotides of the MT2 protein coding
sequence (compare Seq. ID Nos. 3 and 6), as well as the
48 nucleotide sequence occurring upstream of the
initiation codon. The complementary nucleotides to the
initiation codon occur at positions 298-300 in Seq. ID
No. 5. Useful oligomers may be created based on part
or all of the sequences in Seq. ID Nos. 5 and 6.
However, as will be appreciated by those skilled in the
art, other useful sequences which hybridize to other
regions of the transcript readily are created based on
the sequences presented in Seq. ID Nos. 1 and 3 and/or
additional, untranslated sequences, such as are
disclosed for MT2 (NuMA) in Compton et al. and Yang et
al.

While any length oligonucleotide may be utilized to
hybridize to an mRNA transcript, sequences shorter than
8-15 nucleotides may be less specific in hybridizing to
target mRNA. Accordingly, oligonucleotides typically
within the range of 8-100 nucleotides, preferably
within the range of 15-50 nucleotides, are envisioned
to be most useful in standard RNA hybridization assays.
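
The length guidance above can be sketched as a simple enumeration of candidate antisense oligonucleotides tiled across a target region. The window step, the function name and the toy transcript are assumptions made for illustration; only the default 15-50 nucleotide limits follow the preferred range stated in the text.

```python
def candidate_antisense_oligos(transcript, length, step=1, min_len=15, max_len=50):
    """Yield (start, antisense) pairs for every window of `length` nucleotides.

    `transcript` is the sense-strand sequence written as DNA; `start` is the
    1-based position of the window on the transcript.
    """
    if not min_len <= length <= max_len:
        raise ValueError(f"length {length} outside {min_len}-{max_len} nucleotides")
    comp = str.maketrans("ACGT", "TGCA")
    seq = transcript.upper()
    for i in range(0, len(seq) - length + 1, step):
        window = seq[i:i + length]
        # the antisense oligo is the reverse complement of the window
        yield i + 1, window.translate(comp)[::-1]

toy = "GCCACCATGGCTTCTAGAACCGGT"  # hypothetical 24-nt region around an ATG
for start, oligo in candidate_antisense_oligos(toy, length=20, step=2):
    print(start, oligo)
```

Each candidate would still need to be checked experimentally for specificity; the enumeration only narrows the design space.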

The oligonucleotide selected for hybridizing to the
INM transcript, whether synthesized chemically or by
recombinant DNA, then is isolated and purified using
standard techniques and then preferably labelled (e.g.,
with 35S or 32P) using standard labelling protocols.

A sample containing the marker transcript of
interest then is run on an electrophoresis gel, the
dispersed nucleic acids transferred to a nitrocellulose
filter and the labelled oligonucleotide exposed to the
filter under suitable hybridizing conditions, e.g., 50%
formamide, 5 x SSPE, 2 x Denhardt's solution, 0.1% SDS
at 42°C, as described in Molecular Cloning: A
Laboratory Manual, Maniatis et al. Other useful
procedures known in the art include solution
hybridization, and dot and slot RNA hybridization. The
amount of marker transcript present in a sample then is
quantitated by measuring the radioactivity of
hybridized fragments, using standard procedures known
in the art.



Following a similar protocol, oligonucleotides also
may be used to identify other sequences encoding
members of the MT1 and MT2 protein families, for
example, as described in the examples that follow. The
methodology also may be used to identify genetic
sequences associated with the protein coding sequences
described herein, e.g., to identify noncoding sequences
lying upstream or downstream of the protein coding
sequence, and which may play a functional role in
expression of these genes. Where new marker species
are to be identified, degenerate sequences and/or
sequences with preferred codon bias may be created,
using the sequences of Seq. ID Nos. 1 or 3 as
templates, and the general guidelines described in the
art for incorporating degeneracies. (See, for example,
Molecular Cloning: A Laboratory Manual, Maniatis et
al.)

C. Therapeutics

The proteins described herein are associated with
the spindle apparatus during mitosis, and are present
at elevated levels in malignant cells. Accordingly,
without being limited to any particular theory, it is
hypothesized that the proteins likely play a
significant role in cell division, most likely a
structurally related role. Accordingly, these proteins
and their transcripts are good candidates as target
molecules for a cancer chemotherapy.

C.1 Antisense Therapeutics

A particularly useful cancer therapeutic envisioned
is an oligonucleotide complementary to part or all of
the marker transcript, capable of hybridizing
specifically to the transcript and inhibiting
translation of the mRNA when hybridized to the mRNA
transcript. Antisense oligonucleotides have been used
extensively to inhibit gene expression in normal and
abnormal cells. See, for example, Stein et al. (1988)
Cancer Res. 48:2659-2668, for a pertinent review of
antisense theory and established protocols.
Accordingly, the antisense nucleotides to MT1 and MT2
may be used as part of chemotherapy, alone or in
combination with other therapies.

As described in Section I.B above, both
oligoribonucleotide and oligodeoxyribonucleotide
sequences will hybridize to an mRNA transcript and may
be used to inhibit mRNA translation of the marker
proteins described herein. However,
oligoribonucleotides generally are more susceptible to
enzymatic attack by ribonucleases than
oligodeoxyribonucleotides. Hence,
oligodeoxyribonucleotides are preferred for in vivo
therapeutic use to inhibit mRNA translation in an
individual.

Also, as described in Section I.B above, the
therapeutically useful antisense oligonucleotides of
the invention may be synthesized by any of the known
chemical oligonucleotide synthesis methods well
described in the art. Alternatively, a complementary
sequence to part or all of the natural mRNA sequence
may be generated using standard recombinant DNA
technology. For example, the DNA encoding the protein
coding sequence may be inserted in reverse orientation
downstream of a promoter capable of expressing the
sequence such that the noncoding strand is transcribed.

Since the complete nucleotide sequence of the
protein coding sequence as well as additional 5' and 3'
untranslated sequences are known for both MT1 and MT2
(see, for example, Seq. ID Nos. 1 and 3 and Compton
et al.), and/or can be determined with this disclosure,
antisense oligonucleotides hybridizable with any
portion of the mRNA transcripts to these proteins may
be prepared using conventional oligonucleotide
synthesis methods known to those skilled in the art.

Oligonucleotides complementary to and hybridizable
with any portion of the MT1 and MT2 mRNA transcripts
are, in principle, effective for inhibiting translation
of the transcript as described herein. For example, as
described in U.S. Pat. No. 5,098,890, issued March 24,
1992, the disclosure of which is incorporated herein by
reference, oligonucleotides complementary to mRNA at or
near the translation initiation codon site may be used
to advantage to inhibit translation. Moreover, it has
been suggested that sequences that are too distant in
the 3' direction from the translation initiation site
may be less effective in hybridizing the mRNA
transcripts because of potential ribosomal "read-
through", a phenomenon whereby the ribosome is
postulated to unravel the antisense/sense duplex to
permit translation of the message.

Representative antisense sequences for MT1 and MT2
transcripts are described in Seq. ID No. 5 (MT1) and
Seq. ID No. 6 (MT2). The antisense sequences are
complementary to the sequence encoding the N-terminus
of either the MT1 or MT2 marker protein, as well as part
of the 5' untranslated sequences immediately upstream
of the initiation codon. (See Section I.B, above, for a
detailed description of these sequences.) As will be
appreciated by those skilled in the art, antisense
oligonucleotides complementary to other regions of the
MT1 and/or MT2 transcripts are readily created using,
for example, the sequences presented in Seq. ID Nos. 1
and 3 as templates.

As described in Section I.B above, any length
oligonucleotide may be utilized to hybridize to mRNA
transcripts. However, very short sequences (e.g., less
than 8-15 nucleotides) may bind with less specificity.
Moreover, for in vivo use such short sequences may be
particularly susceptible to enzymatic degradation. In
addition, where oligonucleotides are to be provided
directly to the cells, very long sequences may be less
effective at inhibition because of decreased uptake by
the target cell. Accordingly, where the
oligonucleotide is to be provided directly to target
cells, oligonucleotides having a length within the
range of 8-50 nucleotides, preferably 15-30
nucleotides, are envisioned to be most advantageous.

An alternative means for providing antisense
sequences to a target cell is as part of a gene therapy
technique, e.g., as a DNA sequence, preferably part of
a vector, and associated with a promoter capable of
expressing the antisense sequence, preferably
constitutively, inside the target cell. Recently,
Oeller et al. ((1991) Science 254:437-439, the
disclosure of which is incorporated by reference)
described the in vivo inhibition of the ACC synthase
enzyme using a constitutively expressible DNA sequence
encoding an antisense sequence to the full length ACC
synthase transcript. Accordingly, where the antisense
sequences are provided to a target cell indirectly,
e.g., as part of an expressible gene sequence to be
expressed within the cell, longer oligonucleotide
sequences, including sequences complementary to
substantially all the protein coding sequence, may be
used to advantage.

Finally, also as described in Section I.B, above,
the therapeutically useful oligonucleotides
envisioned include not only native oligomers composed
of naturally occurring nucleotides, but also those
comprising modified nucleotides to, for example,
improve stability and lipid solubility and thereby
enhance cellular uptake. For example, it is known that
enhanced lipid solubility and/or resistance to nuclease
digestion results from substituting a methyl group or
sulfur atom for a phosphate oxygen in the
internucleotide phosphodiester linkage.
Phosphorothioates ("S-oligonucleotides" wherein a
phosphate oxygen is replaced by a sulfur atom), in
particular, are stable to nuclease cleavage, are
soluble in lipids, and are preferred, particularly for
direct oligonucleotide administration.
S-oligonucleotides may be synthesized chemically by the
known automated synthesis methods described in
Section I.B above.

Suitable oligonucleotide sequences for mRNA
translation inhibition are readily evaluated by a
standard in vitro assay using standard procedures
described herein and well characterized in the art. An
exemplary protocol is described below, but others are
envisioned and may be used to advantage.

A candidate antisense sequence is prepared as
provided herein, using standard chemical techniques.
For example, an MT1 antisense sequence may be prepared
having the sequence described by positions 285-315 of
Seq. ID No. 5 using an Applied Biosystems automated
DNA Synthesizer, and the oligonucleotide purified
according to the manufacturer's instructions. The
oligonucleotide then is provided to a suitable
malignant cell line in culture, e.g., ME-180, under
standard culture conditions, to be taken up by the
proliferating cells.
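
Sequence-listing positions are numbered from 1 and ranges such as 285-315 include both ends, so the stated range defines a 31-nucleotide oligomer. A minimal sketch of that bookkeeping follows; the helper name and the repeating placeholder string are hypothetical stand-ins and do not reproduce any listed sequence.

```python
def subsequence(seq, first, last):
    """Return positions first..last of seq, numbered from 1, inclusive at both ends."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("positions out of range")
    return seq[first - 1:last]

placeholder = "ACGT" * 100  # stand-in for a listed sequence, not real data
oligo = subsequence(placeholder, 285, 315)
print(len(oligo), oligo)
```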

Preferably, a range of doses is used to determine
effective concentrations for inhibition as well as
specificity of hybridization. For example, a dose
range of 0-100 µg oligonucleotide/ml may be assayed.
Further, the oligonucleotides may be provided to the
cells in a single transfection, or as part of a series
of transfections.

Antisense efficacy may be determined by assaying a
change in cell proliferation over time following
transfection, using standard cell counting methodology
and/or by assaying for reduced expression of marker
protein, e.g., by immunofluorescence, as described in
Section I.A, above. Alternatively, the ability of
cells to take up and use thymidine is another standard
means of assaying for cell division and may be used
here, e.g., using 3H-thymidine. Effective antisense
inhibition should inhibit cell division sufficiently to
reduce thymidine uptake, inhibit cell proliferation,
and/or reduce detectable levels of marker proteins.

Useful concentration ranges are envisioned to vary
according to the nature and extent of the neoplasm, the
particular oligonucleotide utilized, the relative
sensitivity of the neoplasm to the oligonucleotides,
and other factors. Useful ranges for a given cell type
and oligonucleotide may be determined by performing a
standard dose range experiment as described here. Dose
range experiments also may be performed to assess
toxicity levels for normal and malignant cells.
Concentrations from about 1 to 100 µg/ml per 10^ cells
may be employed to advantage.
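
A dose range experiment of this kind might be summarized as percent inhibition relative to the untreated (zero-dose) control. The sketch below assumes thymidine incorporation counts as the readout; the data values, the 50% threshold and the function names are hypothetical.

```python
def percent_inhibition(readings):
    """Map each dose to percent inhibition relative to the zero-dose control.

    readings: dict of {dose_ug_per_ml: counts}; it must include dose 0.
    """
    if 0 not in readings:
        raise ValueError("a zero-dose (untreated) control is required")
    control = readings[0]
    if control <= 0:
        raise ValueError("control reading must be positive")
    return {dose: 100.0 * (1.0 - counts / control)
            for dose, counts in sorted(readings.items())}

def lowest_effective_dose(readings, threshold=50.0):
    """Return the smallest tested dose reaching `threshold` percent inhibition, or None."""
    for dose, inhibition in percent_inhibition(readings).items():
        if dose > 0 and inhibition >= threshold:
            return dose
    return None

# Hypothetical thymidine incorporation (cpm) across a 0-100 ug/ml dose range
cpm_by_dose = {0: 20000, 1: 19000, 10: 12000, 50: 8000, 100: 5000}
for dose, inh in percent_inhibition(cpm_by_dose).items():
    print(f"{dose:>5} ug/ml: {inh:5.1f}% inhibition")
print("lowest dose with >=50% inhibition:", lowest_effective_dose(cpm_by_dose))
```

Running the same summary on normal and malignant cells side by side gives a simple view of the toxicity comparison mentioned above.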

For in vivo use, the antisense oligonucleotides may
be combined with a pharmaceutical carrier, such as a
suitable liquid vehicle or excipient and an optional
auxiliary additive or additives. The liquid vehicles
and excipients are conventional and commercially
available. Illustrative thereof are distilled water,
physiological saline, aqueous solutions of dextrose,
and the like. For in vivo cancer therapies, the
antisense sequences preferably are provided directly to
the malignant cells, as by injection to the neoplasm
locus. Alternatively, the oligonucleotide may be
administered systemically, provided that the antisense
sequence is associated with means for directing the
sequences to the target malignant cells.

In addition to administration with conventional
carriers, the antisense oligonucleotides may be
administered by a variety of specialized
oligonucleotide delivery techniques. For example,
oligonucleotides may be encapsulated in liposomes, as
described in Maniatis et al., Mannino et al. (1988)
BioTechnology 6:682, and Felgner et al. (1989) Bethesda
Res. Lab. Focus 11:21. Reconstituted virus envelopes
also have been successfully used to deliver RNA and DNA
to cells. (See, for example, Arad et al. (1986)
Biochem. Biophys. Acta 859:88-94.)

For therapeutic use in vivo, the antisense
oligonucleotides are provided in a therapeutically
effective amount, e.g., an amount sufficient to inhibit
target protein expression in malignant cells. The
actual dosage administered may take into account
whether the treatment is prophylactic or therapeutic,
the age, weight, health and sex of the patient, the
route of administration, the size and nature of the
malignancy, as well as other factors.
The daily dosage may range from about 0.01 to 1,000 mg
per day. Greater or lesser amounts of oligonucleotide
may be administered, as required. As will be
appreciated by those skilled in the medical art,
particularly the chemotherapeutic art, appropriate dose
ranges for in vivo administration would be routine
experimentation for a clinician. As a preliminary
guideline, effective concentrations for in vitro
inhibition of the target molecule may be determined
first, as described above.

II. B PROTEIN INHIBITION

In another embodiment, the cancer marker protein
itself may be used as a target molecule. For example,
a binding protein designed to bind the marker protein
essentially irreversibly can be provided to the
malignant cells, e.g., by association with a ligand
specific for the cell and known to be absorbed by the
cell. Means for targeting molecules to particular
cells and cell types are well described in the
chemotherapeutic art.

Binding proteins may be obtained and tested as
described in Section I.A above. For example, the
binding portions of antibodies may be used to advantage.
Particularly useful are binding proteins identified 
with high affinity for the target protein, e.g., 
greater than about 10^ M" ^ . Alternatively, the DNA 
encoding the binding protein may be provided to the
target cell as part of an expressible gene to be
expressed within the cell following the procedures used
for gene therapy protocols well described in the art.
(See, for example, U.S. Patent No. 4,497,796, and Gene
Transfer, Vijay R. Baichwal, ed., (1986).) It is
anticipated that the complexed INM protein will be
disabled and can inhibit cell division thereby.

As described above for antisense nucleotides, for
in vivo use, suitable binding proteins may be combined
with a suitable pharmaceutical carrier, such as
physiological saline or other useful carriers well
characterized in the medical art. The pharmaceutical
compositions may be provided directly to malignant
cells, e.g., by direct injection, or may be provided
systemically, provided the binding protein is
associated with means for targeting the protein to
target cells. Finally, suitable dose ranges and cell
toxicity levels may be assessed using standard dose
range experiments. Therapeutically effective
concentrations may range from 0.1-1,000 mg per day. As
described above, actual dosages administered may vary
depending, for example, on the nature of the
malignancy, the age, weight and health of the
individual, as well as other factors.

II. EXEMPLIFICATION

The following examples further describe the utility
of MT1 and MT2 as markers for abnormal cell types, and
how the genetic sequences encoding MT1 and MT2 proteins
were isolated and characterized, including the current
best mode for their cloning and characterization,
without limiting the scope thereof. For example, INM
protein expression in E. coli is described herein.
However, other prokaryotic and eukaryotic cell
expression systems also are contemplated for
recombinant expression of the proteins described
herein. Other useful hosts contemplated include
Saccharomyces, the insect/baculovirus expression
system, and mammalian cells such as xenogenic myeloma
cells and the well-characterized Chinese hamster ovary
cell lines.

MT1

As demonstrated below, MT1 expression levels are
enhanced significantly in a number of different
malignant cell types, including malignant breast,
colon, bladder, ovary, prostate and cervix cell types.
Presented below are the results of a standard
immunoassay (precision ±5%), performed as described
herein and in PCT publication WO93/03497 on nuclear
matrix (NM) preparations made from normal and malignant
human tissue extracts and which were prepared
essentially as described therein (in 8 M urea, 2% β-
mercaptoethanol, 2% Nonidet P-40 (detergent)). The
302.47 antibody was raised against a NM preparation
from CaSki, a cultured cervical tumor cell line
(American Type Culture Collection, ATCC, Rockville,
MD). MT1:2-8 was raised against the cloned MT1
protein. Both antibodies bind to epitopes on the
protein encoded by Seq. ID No. 1, as demonstrated using
standard binding assays. As can be seen from the
results presented below, MT1 is significantly elevated
in malignant bladder tissue. Blotting experiments also
indicate MT1 levels are elevated in other malignant
tissues.

TABLE I

Sample            Antibody Combination    ng MT-1/g tissue

normal bladder    302.47/MT1:2-8          13,500

bladder cancer    302.47/MT1:2-8          32,000
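
The elevation reported in Table I can be expressed as a fold change. The short sketch below also propagates the stated assay precision as a worst-case bound on each value; treating the precision as a relative ±5% error applied independently to both measurements is an assumption made here for illustration, as is the function name.

```python
def fold_change_with_bounds(normal, malignant, rel_precision=0.05):
    """Return (ratio, low, high) for malignant/normal given a relative assay precision."""
    ratio = malignant / normal
    low = malignant * (1 - rel_precision) / (normal * (1 + rel_precision))
    high = malignant * (1 + rel_precision) / (normal * (1 - rel_precision))
    return ratio, low, high

# Values from Table I (ng MT1 per g tissue)
ratio, low, high = fold_change_with_bounds(13_500, 32_000)
print(f"fold elevation: {ratio:.2f} (range {low:.2f} to {high:.2f})")
```

Under these assumptions the bladder cancer sample carries roughly 2.4 times the MT1 of normal bladder, and the lower bound remains above 2.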

Cloning

The natural-sourced MT1 protein first was separated
from human cervical tumor cells essentially following
the procedure of Penman and Fey described in U.S.
Pat. Nos. 4,882,268 and 4,885,236. Cells from the human
cervical tumor cell lines CaSki and ME180 (obtained
from the American Type Culture Collection, ATCC,
Rockville, MD) were grown to confluence and removed
from flasks by trypsinization. Suspended cells were
washed twice with phosphate buffered saline (PBS) and
extracted with cytoskeletal buffer (CSK): 100 mM NaCl,
300 mM sucrose, 10 mM PIPES, 3 mM MgCl2, 1 mM EGTA,
0.5% Triton X-100, 1.2 mM PMSF for 1 min at 4°C,
followed by extraction in cold RSB (reticulocyte
suspension buffer)/double detergent buffer: 100 mM
NaCl, 3 mM MgCl2, 10 mM Tris, pH 7.4, 1% Tween 40, 0.5%
deoxycholate, 1.2 mM PMSF. Alternatively, cells were
extracted twice with the RSB/double detergent buffer.
The two extraction protocols produced very similar
preparations. The extracted cells were digested for 30
min at room temperature in digestion buffer: 50 mM
NaCl, 300 mM sucrose, 0.5% Triton X-100, 10 mM PIPES
(pH 6.8), 3 mM MgCl2, 1 mM EGTA, 1.2 mM PMSF, containing
100 µg/ml of both RNase A and DNase I. Chromatin was
extracted from the digested nuclei by the addition of
2 M ammonium sulfate to a final concentration of 0.25
M. The extracted nuclear matrix-intermediate filament
(NM-IF) scaffolds then were sedimented at 3700 x g for
15 min.
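
The volume of 2 M ammonium sulfate needed to reach a final concentration of 0.25 M follows from a simple mass balance. The sketch below assumes the digest initially contains no ammonium sulfate and that volumes are additive; the function name and the 7 ml sample volume are hypothetical.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches `final_conc`.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v.
    The result is in the same units as `sample_volume`.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return sample_volume * final_conc / (stock_conc - final_conc)

# 2 M stock brought to 0.25 M final; the 7 ml digest volume is made up
print(stock_volume_to_add(7.0, stock_conc=2.0, final_conc=0.25))
```

For these concentrations the stock is added at one seventh of the sample volume, i.e. 1 ml of stock per 7 ml of digest.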

The resulting pellet then was resuspended in
disassembly buffer: 8 M urea, 20 mM MES (pH 6.6), 1 mM
EGTA, 1.2 mM PMSF, 0.1 mM MgCl2, 1% 2-mercaptoethanol,
and the pellet sonicated and dialyzed overnight with
3 changes of 2000 volumes of assembly buffer: 0.15 M
KCl, 25 mM imidazole (pH 7.1), 5 mM MgCl2, 2 mM DTT,
0.125 mM EGTA, 0.2 mM PMSF. The dialysate then was
centrifuged at 100k x g for 1 h and the NM proteins
recovered from the supernatant. Alternatively, NM-IF
scaffolds were extracted directly with E400 buffer:
0.4 M NaCl, 0.02 M Tris pH 7.5, 0.1 mM MgCl2, 0.5%
2-mercaptoethanol, 1.2 mM PMSF, for 30 min at 4°C, as
described by von Kries et al. (1991) Cell 64:123-135.
The intermediate filament-rich pellet then was removed
after centrifugation for 90 min at 40K rpm in a Beckman
70.1 Ti rotor. The supernatant remaining is enriched
in MT1 protein with little cytokeratin contamination.

MT1-specific antibodies were produced by standard
procedures. Specifically, Balb/c by J mice (Jackson
Laboratory, Bar Harbor, ME) were injected
intraperitoneally with purified CaSki NM protein every
2 weeks for a total of 16 weeks. The mice were
injected with a single boost 4 days prior to sacrifice
and removal of the spleen. Freund's complete adjuvant
was used in the first injection, incomplete Freund's in
the second injection; subsequent injections were made
with saline. Spleen cells were fused with the SP2/0-
Ag14 mouse myeloma line (ATCC, Rockville, MD) using the
standard fusion methodologies well known in the art.
Hybridomas producing antibodies that reacted with
nuclear matrix proteins were cloned and grown as
ascites. Antigen specificity was assessed both by
immunofluorescence spectroscopy and Western blot
analysis. The 302.47 antibody was used to screen an
expression library as described below to isolate the
MT1 gene.

The cDNA clones for MT1 were obtained from a Lambda
ZAP expression library (Stratagene, La Jolla, CA).
Library screening was carried out according to the
manufacturer's instructions and using the MT1-specific
antibody 302.47. Briefly, a single positive clone
containing a 2.45 kb insert was identified and
subcloned into pBluescript II vectors (Stratagene,
La Jolla, CA) opened at the EcoRI and XhoI cloning
sites. The resulting plasmid, pMT1, was sequenced
directly and further subcloned to produce the MT1
fusion protein (see below).

The cDNA sequences were obtained using the standard
dideoxy method described in the art. Double-stranded
sequencing was done utilizing the pMT1 vector primed
with appropriate primers according to the manufacturer's
instructions (Stratagene, La Jolla, CA). Internal
sequences were obtained using synthetic primers
designed from the identified sequence.

The entire nucleotide sequence and predicted amino
acid sequence for MT1 are shown in Seq. ID No. 1. The
cDNA clone retains a polyadenylation signal, a putative
initiation codon, a continuous open reading frame and
codon utilization consistent with a human gene. The
predicted amino acid sequence of MT1 consists of 639
amino acids, corresponding to a protein of 70.5 kD with
a pI of 5.47. The secondary structure, as predicted by the
Chou-Fasman algorithm (Chou and Fasman, (1978) Adv. Enzymol.
Relat. Areas Mol. Biol. 42:145-148), consists of 72%
alpha helix of which 56% is extended helix.
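
By way of illustration only, a molecular weight can be calculated from a
predicted amino acid sequence by summing average residue masses and adding
one water molecule for the free termini. The following Python sketch shows
the calculation. The function name, the mass table and the short test
peptide (the first seven residues of MT1 in one-letter code) are
assumptions introduced here and are not part of the original disclosure.

```python
# Average residue masses in daltons (standard textbook values; this table is
# an illustrative assumption and is not taken from the original disclosure).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of a one-letter sequence."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


if __name__ == "__main__":
    # First seven residues of MT1: Met Lys Glu Ser Lys Gln Pro.
    peptide = "MKESKQP"
    print(f"{peptide}: {molecular_weight(peptide):.1f} Da")
```

Applied to the full 639-residue sequence, the same sum would give the
predicted mass that is compared with the apparent mass on SDS-PAGE.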

The primary structure of MT1, represented in
Fig. 1, contains 27 proline residues which generally
occur in pairs or triplets throughout the molecule.
The proline distribution within the sequence is
illustrated in Fig. 1A, where diamonds represent the
proline residues. Proline pairs and triplets are
indicated by stacked diamonds. At the N terminus, a
40 amino acid stretch contains a cluster of 8 prolines
(residues 42-81, Seq. ID No. 1) that occur as pairs
separated by 3 or fewer amino acids. A similar
proline-rich region occurs in the C terminus of MT1
(residues 551-563) where 6 prolines occur in a 13 amino
acid stretch. Both proline-rich regions likely lie on
the protein surface, based on probability calculations
determined by the technique of Emini et al. (1985) J.
Virol. 55:836-839. The high proline density also may
explain the anomalous apparent molecular weight of the
protein as determined by SDS polyacrylamide gel
electrophoresis. As described above, the predicted
molecular weight for MT1, calculated from the amino
acid sequence, is 70.1 kD. However, as described
below, both the natural-sourced and recombinant protein
migrate as a 90 kD protein on an SDS polyacrylamide
gel. Alternatively, the molecular weight discrepancy may
result from some post-translational modification
achievable in both prokaryotic and eukaryotic cells.
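
The proline counts given above for the two terminal regions can be checked
directly against the sequence listing. The Python sketch below is
illustrative only: the two segments were transcribed from the listing and
converted to one-letter code for this example, and the helper names are
hypothetical.

```python
from typing import List

# Segments transcribed from the MT1 amino acid sequence (one-letter code).
N_TERMINAL_42_81 = "PAPAVQPEESLKTDHPEIGEGKPTPALSEASSSSIRERPP"
C_TERMINAL_551_563 = "PPQQLKPPPELCP"


def proline_positions(segment: str, start: int) -> List[int]:
    """Return residue numbers of prolines, numbering the segment from `start`."""
    return [start + i for i, aa in enumerate(segment) if aa == "P"]


def max_prolines_in_window(sequence: str, window: int) -> int:
    """Return the largest proline count found in any window of `window` residues."""
    if window <= 0 or window > len(sequence):
        raise ValueError("window must be between 1 and len(sequence)")
    return max(sequence[i:i + window].count("P")
               for i in range(len(sequence) - window + 1))


if __name__ == "__main__":
    print(proline_positions(N_TERMINAL_42_81, 42))
    print(proline_positions(C_TERMINAL_551_563, 551))
```

The sliding-window helper is included so that the same check can be run
over a complete sequence to locate proline-dense stretches.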

Between the two proline-rich termini, MT1 displays
a sequence consistent with a region of extended alpha
helix structure, indicated by the hatched structure in
Fig. 1B. The extended helix is interrupted in 4 places
by short helix-distorting amino acid stretches that
usually include a pair of proline residues. A
preliminary hypothesis as to the structure of MT1 based
on these theoretical calculations is that the molecule
consists of an extended rod that is bounded on either
end by a globular, proline-rich domain.



Analysis of all available sequence databases
indicates that MT1 has a novel sequence that bears no
significant homology to any known protein. In
addition, the sequence appears to lack any known,
identifiable DNA binding motif such as the leucine
zipper motif.

The cloned MT1 DNA was used to perform standard
Northern blot analysis of total and poly A+ RNA from
ME-180 cells, using standard procedures and 15 µg RNA.
After blotting and hybridization with 32P-labelled pMT1
DNA, a single mRNA band was detected in the poly A+
fraction. This band was not apparent in the total RNA
lane after a 48 h exposure of the autoradiogram,
indicating that the MT1 message is a low abundance
species. Northern blot analysis indicates that the MT1
protein is translated from a single mRNA. Northern
blot analysis also indicates that the MT1 RNA includes
approximately 500 bp 5' of the protein-coding sequence
presented in Seq. ID No. 1. This upstream sequence
may represent one or more untranslated sequences and/or
may encode additional protein coding sequences.

A fusion protein for MT1 was obtained using the
insert from the pMT1 construct described above and in
Seq. ID No. 1, and the pMAL expression system (New
England Biolabs Inc., Beverly, MA). In this system the
gene of interest (MT1) is cloned into the pMAL-c vector
(New England Biolabs Inc., Beverly, MA) and the vector
transfected into E. coli and expressed to produce a
fusion protein containing both the protein of interest
and the maltose binding protein. The maltose binding
protein allows the fusion protein to be selectively
purified in the presence of maltose and can be
subsequently removed by proteolytic cleavage with Factor
Xa to yield intact, recombinant MT1 protein. Here, MT1
cDNA was cloned into the pMAL-c vector such that the
initiation AUG codon was directly continuous with the
5' terminus of the maltose binding protein. After
proteolytic cleavage with Factor Xa the resulting MT1
fusion protein retains the complete amino acid sequence
encoded by the MT1 cDNA with no additional amino acids.
All experimental details of the pMAL system were
carried out according to the manufacturer's
instructions.

As described above, both the natural-sourced and
recombinantly produced protein have an electrophoretic
mobility consistent with an apparent molecular weight
of about 90 kD on SDS-PAGE. In addition, the pI of both
proteins is equivalent (5.4) and consistent with the
predicted pI as calculated from the amino acid
sequence. Peptide mapping of both proteins by cleavage
at cysteine residues with 2-nitro-5-thiocyanobenzoic
acid (NTCB), following the method of Leto and Marchesi
(1984) J. Biol. Chem. 259:4603-4049, yields equivalent
peptide fragments which share the same MT1 cross
reactivity by Western blot analysis. Moreover, the
number and size of the peptide fragments produced are
consistent with those predicted from the proposed MT1
amino acid sequence.
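
The fragments expected from cleavage at cysteine can be predicted from a
sequence by splitting it on the amino-terminal side of each cysteine
residue, which is where NTCB cleaves. The Python sketch below illustrates
this under the assumption of complete cleavage at every cysteine; the
function name and the example peptide are hypothetical and are not taken
from the original disclosure.

```python
from typing import List


def ntcb_fragments(sequence: str) -> List[str]:
    """Split a one-letter sequence on the N-terminal side of every cysteine.

    Assumes complete cleavage; partial digestion products are not modelled.
    """
    fragments: List[str] = []
    current = ""
    for aa in sequence.upper():
        # A cysteine starts a new fragment unless it is the first residue.
        if aa == "C" and current:
            fragments.append(current)
            current = ""
        current += aa
    if current:
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    # Hypothetical peptide used only to show the splitting rule.
    for fragment in ntcb_fragments("AACDEFCGH"):
        print(len(fragment), fragment)
```

The number and lengths of the predicted fragments can then be compared
with the bands observed after peptide mapping.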


MT2 

Like MT1, MT2 expression levels are enhanced
significantly in a variety of malignant cell types, as
determined both by serum assays and tissue culture
supernatant assays. In the assays described below,
the antibodies used were raised against two different
cervical tumor cell line NM preparations (ME-180 and
CaSki, ATCC, Rockville, MD). The 100-series antibodies
are those raised against the ME-180 immunogen; the 300-
series antibodies are those raised against the CaSki NM
immunogen. Of the antibodies described below, 107.7
and 307.33 have been determined to bind specifically
with the MT2 protein, and 302-18, 302-22 and 302-29
cross react with a protein closely associated with MT2
and which co-isolates with it.






Dose response evaluation results of two antibody 
combinations are shown in Table II, below, using ME-180 
cell culture supernatant as the antigen source. Each 
assay shows dose dependent detection of antigen in the 
tissue culture supernatant, demonstrating the ability 
of the assay to quantitate soluble interior nuclear 
matrix protein released from dying cells. 

Table II

Antibody 107-7 solid phase, 302-29 soluble antibody,
ME-180 supernatant

Concentration
of supernatant    Mean OD    SD

2:1               0.501      0.013
undiluted         0.274      0.018
1:2               0.127      0.006
1:4               0.067      0.006
1:8               0.035      0.009
1:16              0.021      0.007
No Sup            0.000

Antibody 107-7 solid phase, 307-33 soluble antibody,
ME-180 supernatant

Concentration
of supernatant    Mean OD    SD

3:1               0.906      0.009
3:2               0.456      0.011
3:4               0.216      0.007
3:8               0.099      0.005
3:16              0.052      0.002
3:32              0.031      0.005
No Sup
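
The dose dependence in Table II can be examined by comparing each mean
optical density with the value at the next two-fold dilution; a ratio near
2 indicates a signal roughly proportional to antigen concentration. The
Python sketch below is illustrative only. The values are transcribed from
Table II, and the variable and function names are assumptions introduced
for this example.

```python
from typing import List

# Mean OD values transcribed from Table II, from most to least concentrated.
# Each successive entry is a two-fold dilution of the ME-180 supernatant.
MEAN_OD_302_29 = [0.501, 0.274, 0.127, 0.067, 0.035, 0.021]
MEAN_OD_307_33 = [0.906, 0.456, 0.216, 0.099, 0.052, 0.031]


def step_ratios(mean_od: List[float]) -> List[float]:
    """Return the OD ratio between each dilution and the next two-fold dilution."""
    return [high / low for high, low in zip(mean_od, mean_od[1:])]


if __name__ == "__main__":
    for label, values in (("302-29", MEAN_OD_302_29), ("307-33", MEAN_OD_307_33)):
        ratios = ", ".join(f"{r:.2f}" for r in step_ratios(values))
        print(f"107-7 / {label}: {ratios}")
```

A full standard curve would normally be fitted to these points; the simple
ratio is used here only to make the two-fold relationship visible.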
Next, interior nuclear matrix protein
quantification was tested in supernatant from a variety
of dying tumor tissues. Here, tumor and normal tissues
were allowed to die in media by serum deprivation.
Specifically, cell lines were grown to confluency in
tissue culture flasks by standard culturing techniques.
The media then was replaced with serum-free media and
the cells placed in a 37°C incubator with 5% CO2 for 7
to 14 days. At the end of the incubation the media was
collected and centrifuged at 14,000 x g to remove
cellular debris. Supernatants were assayed in various
configurations of sandwich assays.

The results are shown in Fig. 3, where all values 
are in units/gm, using ME-180 antigen as standard. As 
can be seen from Fig. 3, MT2 antigen is released from 
each of the dying tissues and the increased cell death 
in tumor tissue is reflected in a higher MT2 average 
antigen value quantitated in cancer tissue versus 
normal tissue. 

Figure 4 shows the results of an analogous
experiment performed using serum samples from cancer
patients and normal blood donors. Here tissue is
prepared as follows. Tissue is removed from a donor,
flash frozen in liquid nitrogen within 10 min to 4 hrs
after removal and stored at -70°C until required. When
ready to be used, the tissue is chopped into 0.1 to 0.3
cm cubes as it thaws using aseptic techniques in a
laminar flow hood and placed in a T150 flask containing
serum free media containing Fungizone and gentamycin.
In general, 2-4 g of tissue are used per 100 ml media in
the T150 flask. The flask containing the tissue then
is incubated for 4-7 days at 37°C with 5% CO2. After
incubation the media is collected from the flasks and
centrifuged at 14,000 x g for 20 min. As for Fig. 3,
ME-180 cell antigen was the standard. Results are
reported in units/ml. A control experiment diluting
supernatant antigen into serum and then quantitating
the protein in solution indicates that serum has little
or no effect on the assay. As can be seen in the
results presented in Fig. 4, like the results shown in
Fig. 3, serum samples from cancer patients reflect a
higher rate of cell death as indicated by the
quantifiably higher levels of MT2 antigen detected in
these samples compared with those detected in the
normal blood serum samples.

Cloning 

Following the same general procedure as for MT1, a
composition selectively enriched for MT2 was obtained
from ME-180 cells (cervical carcinoma cells, from ATCC,
Rockville, MD), and MT2-specific antibodies prepared.
The 107.7 antibody was used to obtain a partial cDNA
clone for MT2, by screening a lambda ZAP expression
library, as for MT1. The partial clone retrieved then
was subcloned into a pBluescript II vector (pMT2) and
the MT2 cDNA sequenced using standard techniques. The
sequenced DNA, which corresponds to residues 1366 to
2865 of Seq. ID No. 3, then was analyzed to determine
the reading frame and encoded amino acid sequence. The
complete coding sequence subsequently was determined
and is presented in Seq. ID No. 3 (Compton et al.
(1992) J. Cell Biol. 116:1395-1408). The nucleotide
sequence and predicted amino acid sequence for MT2 are
described in Seq. ID No. 3.
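
Determining the reading frame of a sequenced clone amounts to translating
the DNA in each of the three forward frames and looking for the frame free
of premature stop codons. The Python sketch below illustrates the
translation step using the standard genetic code. The function and
variable names are assumptions introduced for this example, and the test
input is the first eight codons of the MT2 coding sequence as listed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string in the given forward frame (0, 1 or 2)."""
    dna = "".join(dna.upper().split())  # drop the spaces used in the listing
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


if __name__ == "__main__":
    start = "ATG ACA CTC CAC GCC ACC CGG GGG"
    for frame in range(3):
        print(frame, translate(start, frame))
```

Only unambiguous bases (A, C, G, T) are handled; a sequence containing
other characters would raise a KeyError in this sketch.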
The primary structure of MT2 is represented
schematically in Fig. 2. The protein appears to
comprise at least 6 helical regions separated by
proline pairs (see Fig. 2A and B). The primary
structure may allow the protein to form a coiled-coil
structure in solution. As for Fig. 1, prolines are
indicated by diamonds and helices by hatched boxes. In
addition, both the N and C termini of MT2 appear to
fold as globular domains (Compton et al. (1992) J. Cell
Biol. 116:1395-1408).

The invention may be embodied in other specific
forms without departing from the spirit or essential
characteristics thereof. The present embodiments are
therefore to be considered in all respects as
illustrative and not restrictive, the scope of the
invention being indicated by the appended claims rather
than by the foregoing description, and all changes
which come within the meaning and range of equivalency
of the claims are therefore intended to be embraced
therein.



WO 94/00573    PCT/US93/06160



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Matritech, Inc.

(B) STREET: 763D Concord Ave 

(C) CITY: Cambridge 

(D) STATE: Massachusetts 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP): 02138 

(G) TELEPHONE: 1-617-661-6660

(H) TELEFAX: 1-617-661-8522 

(I) TELEX: 

(ii) TITLE OF INVENTION: NOVEL MALIGNANT CELL TYPE MARKERS 
OF THE INTERIOR NUCLEAR MATRIX 

(iii) NUMBER OF SEQUENCES: 6 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: TESTA HURWITZ & THIBEAULT

(B) STREET: 53 STATE STREET 

(C) CITY: BOSTON 

(D) STATE: MA 

(E) COUNTRY: USA 

(F) ZIP: 02109 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: PITCHER ESQ, EDMUND R 

(B) REGISTRATION NUMBER: 27,829 

(C) REFERENCE/DOCKET NUMBER: MTP-013 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 617/248-7000 

(B) TELEFAX: 617/248-7100

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2360 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: HOMO SAPIENS 
(F) TISSUE TYPE: CERVIX TUMOR 



GAGATGGTTC TTGGTCCTGC AGCTTATAAT GTTCCATTGC CAAAGAAATC GATTCAGTCG 60 

GGTCCACTAA AAATCTCTAG TGTATCAGAA GTA ATG AAA GAA TCT AAA CAG CCT 114 

Met Lys Glu Ser Lys Gin Pro 
1 5 

GCC TCA CAA CTC CAA AAA CAA AAG GGA GAT ACT CCA GCT TCA GCA ACA 162 
Ala Ser Gin Leu Gin Lys Gin Lys Gly Asp Thr Pro Ala Ser Ala Thr 
10 15 20 

GCA CCT ACA GAA GCG GCT CAA ATT ATT TCT GCA GCA GGT GAT ACC CTG 210 
Ala Pro Thr Glu Ala Ala Gin He He Ser Ala Ala Gly Asp Thr Leu 
25 30 35 

TCG GTC CCA GCC CCT GCA GTT CAG CCT GAG GAA TCT TTA AAA ACT GAT 258 
Ser Val Pro Ala Pro Ala Val Gin Pro Glu Glu Ser Leu Lys Thr Asp 
40 45 50 55 

CAC CCT GAA ATT GGT GAA GGA AAA CCC ACA CCT GCA CTT TCA GAA GCA 306 
His Pro Glu He Gly Glu Gly Lys Pro Thr Pro Ala Leu Ser Glu Ala 
60 65 70 

TCC TCA TCT TCT ATA AGG GAG CGA CCA CCT GAA GAA GTT GCA GCT CGC 354 
Ser Ser Ser Ser He Arg Glu Arg Pro Pro Glu Glu Val Ala Ala Arg 
75 80 85

CTT GCA CAA CAG GAA AAA CAA GAA CAA GTT AAA ATT GAG TCT CTA GCC 402 
Leu Ala Gin Gin Glu Lys Gin Glu Gin Val Lys He Glu Ser Leu Ala 
90 95 100 

AAG AGC TTA GAA GAT GCT CTG AGG CAA ACT GCA AGT GTC ACT CTG CAG 450 
Lys Ser Leu Glu Asp Ala Leu Arg Gln Thr Ala Ser Val Thr Leu Gln
105 110 115

GCT ATT GCA GCT CAG AAT GCT GCG GTC CAG GCT GTC AAT GCA CAC TCC 498 
Ala He Ala Ala Gin Asn Ala Ala Val Gin Ala Val Asn Ala His Ser 
120 125 130 135 
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AAC ATA TTG AAA GCC GCC ATG GAC AAT TCT GAG ATT GCA GGC GAG AAG 546 
Asn He Leu Lys Ala Ala Met Asp Asn Ser Glu He Ala Gly Glu Lys 
140 145 150 

AAA TCT GCT CAG TGG CGC ACA GTG GAG GGT GCA TTG AAG GAA CGC AGA 594 
Lys Ser Ala Gin Trp Arg Thr Val Glu Gly Ala Leu Lys Glu Arg Arg 
155 160 165 

AAG GCA GTA GAT GAA GCT GCC GAT GCC CTT CTC AAA GCC AAA GAA GAG 642 
Lys Ala Val Asp Glu Ala Ala Asp Ala Leu Leu Lys Ala Lys Glu Glu 
170 175 180 

TTA GAG AAG ATG AAA AGT GTG ATT GAA AAT GCA AAG AAA AAA GAG GTT 690 
Leu Glu Lys Met Lys Ser Val He Glu Asn Ala Lys Lys Lys Glu Val 
185 190 195 

GCT GGG GCC AAG CCT CAT ATA ACT GCT GCA GAG GGT AAA CTT CAC AAC 738 
Ala Gly Ala Lys Pro His Ile Thr Ala Ala Glu Gly Lys Leu His Asn
200 205 210 215 

ATG ATA GTT GAT CTG GAT AAT GTG GTC AAA AAG GTC CAA GCA GCT CAG 786 
Met He Val Asp Leu Asp Asn Val Val Lys Lys Val Gin Ala Ala Gin 
220 225 230 

TCT GAG GCT AAG GTT GTA TCT CAG TAT CAT GAG CTG GTG GTC CAA GCT 834 
Ser Glu Ala Lys Val Val Ser Gin Tyr His Glu Leu Val Val Gin Ala 
235 240 245 

CGG GAT GAC TTT AAA CGA GAG CTG GAC AGT ATT ACT CCA GAA GTC CTT 882 
Arg Asp Asp Phe Lys Arg Glu Leu Asp Ser He Thr Pro Glu Val Leu 
250 255 260 

CCT GGG TGG AAA GGA ATG AGT GTT TCA GAC TTA GCT GAC AAG CTC TCT 930 
Pro Gly Trp Lys Gly Met Ser Val Ser Asp Leu Ala Asp Lys Leu Ser 
265 270 275 

ACT GAT GAT CTG AAC TCC CTC ATT GCT CAT GCA CAT CGT CGT ATT GAT 978 
Thr Asp Asp Leu Asn Ser Leu He Ala His Ala His Arg Arg He Asp 
280 285 290 295 

CAG CTG AAC AGA GAG CTG GCA GAA CAG AAG GCC ACC GAA AAG CAG CAC 1026 
Gin Leu Asn Arg Glu Leu Ala Glu Gin Lys Ala Thr Glu Lys Gin His 
300 305 310 

ATC ACG TTA GCC TTG GAG AAA CAA AAG CTG GAA GAA AAG CGG GCA TTT 1074 
He Thr Leu Ala Leu Glu Lys Gin Lys Leu Glu Glu Lys Arg Ala Phe 
315 320 325 

GAC TCT GCA GTA GCA AAA GCA TTA GAA CAT CAC AGA AGT GAA ATA CAG 1122 
Asp Ser Ala Val Ala Lys Ala Leu Glu His His Arg Ser Glu He Gin 
330 335 340 
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GCT GAA CAG GAC AGA AAG ATA GAA GAA GTC AGA GAT GCC ATG GAA AAT 1170 
Ala Glu Gin Asp Arg Lys He Glu Glu Val Arg Asp Ala Met Glu Asn 
345 350 355 

GAA ATG AGA ACC CCT TCG CCG ACA GCA GCT GCC CAC ACT GAT CAC TTG 1218 
Glu Met Arg Thr Pro Ser Pro Thr Ala Ala Ala His Thr Asp His Leu
360 365 370 375 

CGA GAT GTC CTT AGG GTA CAA GAA CAG GAA TTG AAG TCT GAA TTT GAG 1266 
Arg Asp Val Leu Arg Val Gln Glu Gln Glu Leu Lys Ser Glu Phe Glu
380 385 390

CAG AAC CTG TCT GAG AAA CTC TCT GAA CAA GAA TTA CAA TTT CGT CGT 1314 
Gin Asn Leu Ser Glu Lys Leu Ser Glu Gin Glu Leu Gin Phe Arg Arg 
395 400 405

CTC AGT CAA GAG CAA GTT GAC AAC TTT ACT CTG GAT ATA AAT ACT GCC 1362 
Leu Ser Gin Glu Gin Val Asp Asn Phe Thr Leu Asp He Asn Thr Ala 
410 415 420 

TAT GCC AGA CTC AGA GGA ATC GAA CAG GCT GTT CAG AGC CAT GCA GTT 1410 
Tyr Ala Arg Leu Arg Gly He Glu Gin Ala Val Gin Ser His Ala Val 
425 430 435 

GCT GAA GAG GAA GCC AGA AAA GCC CAC CAA CTC TGG CTT TCA GTG GAG 1458 
Ala Glu Glu Glu Ala Arg Lys Ala His Gin Leu Trp Leu Ser Val Glu 
440 445 450 455 

GCA TTA AAG TAC AGC ATG AAG ACC TCA TCT GCA GAA ACA CCT ACT ATC 1506 
Ala Leu Lys Tyr Ser Met Lys Thr Ser Ser Ala Glu Thr Pro Thr He 
460 465 470 

CCG CTG GGT AGT GCG GTT GAG GCC ATC AAA GCC AAC TGT TCT GAT AAT 1554 
Pro Leu Gly Ser Ala Val Glu Ala He Lys Ala Asn Cys Ser Asp Asn 
475 480 485 

GAA TTC ACC CAA GCT TTA ACC GCA GCT ATC CCT CCA GAG TCC CTG ACC 1602 
Glu Phe Thr Gin Ala Leu Thr Ala Ala He Pro Pro Glu Ser Leu Thr 
490 495 500 

CGT GGG GTG TAC AGT GAA GAG ACC CTT AGA GCC CGT TTC TAT GCT GTT 1650 
Arg Gly Val Tyr Ser Glu Glu Thr Leu Arg Ala Arg Phe Tyr Ala Val 
505 510 515 

CAA AAA CTG GCC CGA AGG GTA GCA ATG ATT GAT GAA ACC AGA AAT AGC 1698
Gln Lys Leu Ala Arg Arg Val Ala Met Ile Asp Glu Thr Arg Asn Ser
520 525 530 535

TTG TAC CAG TAC TTC CTC TCC TAC CTA CAG TCC CTG CTC CTA TTC CCA 1746
Leu Tyr Gln Tyr Phe Leu Ser Tyr Leu Gln Ser Leu Leu Leu Phe Pro
540 545 550
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CCT CAG CAA CTG AAG CCG CCC CCA GAG CTC TGC CCT GAG GAT ATA AAC 1794 
Pro Gln Gln Leu Lys Pro Pro Pro Glu Leu Cys Pro Glu Asp Ile Asn
555 560 565 

ACA TTT AAA TTA CTG TCA TAT GCT TCC TAT TGC ATT GAG CAT GGT GAT 1842
Thr Phe Lys Leu Leu Ser Tyr Ala Ser Tyr Cys He Glu His Gly Asp 
570 575 580 

CTG GAG CTA GCA GCA AAG TTT GTC AAT CAG CTG AAG GGG GAA TCC AGA 1890 
Leu Glu Leu Ala Ala Lys Phe Val Asn Gin Leu Lys Gly Glu Ser Arg 
585 590 595 

CGA GTG GCA CAG GAC TGG CTG AAG GAA GCC CGA ATG ACC CTA GAA ACG 1938 
Arg Val Ala Gin Asp Trp Leu Lys Glu Ala Arg Met Thr Leu Glu Thr 
600 605 610 615 

AAA CAG ATA GTG GAA ATC CTG ACA GCA TAT GCC AGC GCC GTA GGA ATA 1986 
Lys Gin He Val Glu He Leu Thr Ala Tyr Ala Ser Ala Val Gly He 
620 625 630 

GGA ACC ACT CAG GTG CAG CCA GAG TGAGGTTTAG GAAGATTTTC ATAAAGTCAT 2040 
Gly Thr Thr Gin Val Gin Pro Glu 
635 

ATTTCATGTC AAAGGAAATC AGCAGTGATA GATGAAGGGT TCGCAGCGAG AGTCCCGGAC 2100 

TTGTCTAGAA ATGAGCAGGT TTACAAGTAC TGTTCTAAAT GTTAACACCT GTTGCATTTA 2160 

TATTCTTTCC ATTTGCTATC ATGTCAGTGA ACGCCAGGAG TGCTTTCTTT GCAACTTGTG 2220 

TAACATTTTC TGTTTTTTCA GGTTTTACTG ATGAGGCTTG TGAGGCCAAT CAAAATAATG 2280 

TTTGTGATCT CTACTACTGT TGATTTTGCC CTCGGAGCAA ACTGAATAAA GCAACAAGAT 2340 

GAAAAAAAAA AAAAAAAAAA 2360

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 639 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Lys Glu Ser Lys Gln Pro Ala Ser Gln Leu Gln Lys Gln Lys Gly
1 5 10 15

Asp Thr Pro Ala Ser Ala Thr Ala Pro Thr Glu Ala Ala Gln Ile Ile
20 25 30

Ser Ala Ala Gly Asp Thr Leu Ser Val Pro Ala Pro Ala Val Gln Pro
35 40 45

Glu Glu Ser Leu Lys Thr Asp His Pro Glu Ile Gly Glu Gly Lys Pro
50 55 60

Thr Pro Ala Leu Ser Glu Ala Ser Ser Ser Ser Ile Arg Glu Arg Pro
65 70 75 80

Pro Glu Glu Val Ala Ala Arg Leu Ala Gln Gln Glu Lys Gln Glu Gln
85 90 95

Val Lys Ile Glu Ser Leu Ala Lys Ser Leu Glu Asp Ala Leu Arg Gln
100 105 110

Thr Ala Ser Val Thr Leu Gln Ala Ile Ala Ala Gln Asn Ala Ala Val
115 120 125

Gln Ala Val Asn Ala His Ser Asn Ile Leu Lys Ala Ala Met Asp Asn
130 135 140

Ser Glu Ile Ala Gly Glu Lys Lys Ser Ala Gln Trp Arg Thr Val Glu
145 150 155 160

Gly Ala Leu Lys Glu Arg Arg Lys Ala Val Asp Glu Ala Ala Asp Ala
165 170 175

Leu Leu Lys Ala Lys Glu Glu Leu Glu Lys Met Lys Ser Val Ile Glu
180 185 190

Asn Ala Lys Lys Lys Glu Val Ala Gly Ala Lys Pro His Ile Thr Ala
195 200 205

Ala Glu Gly Lys Leu His Asn Met Ile Val Asp Leu Asp Asn Val Val
210 215 220

Lys Lys Val Gln Ala Ala Gln Ser Glu Ala Lys Val Val Ser Gln Tyr
225 230 235 240

His Glu Leu Val Val Gln Ala Arg Asp Asp Phe Lys Arg Glu Leu Asp
245 250 255

Ser Ile Thr Pro Glu Val Leu Pro Gly Trp Lys Gly Met Ser Val Ser
260 265 270

Asp Leu Ala Asp Lys Leu Ser Thr Asp Asp Leu Asn Ser Leu Ile Ala
275 280 285

His Ala His Arg Arg Ile Asp Gln Leu Asn Arg Glu Leu Ala Glu Gln
290 295 300

Lys Ala Thr Glu Lys Gln His Ile Thr Leu Ala Leu Glu Lys Gln Lys
305 310 315 320

Leu Glu Glu Lys Arg Ala Phe Asp Ser Ala Val Ala Lys Ala Leu Glu
325 330 335

His His Arg Ser Glu Ile Gln Ala Glu Gln Asp Arg Lys Ile Glu Glu
340 345 350

Val Arg Asp Ala Met Glu Asn Glu Met Arg Thr Pro Ser Pro Thr Ala
355 360 365

Ala Ala His Thr Asp His Leu Arg Asp Val Leu Arg Val Gln Glu Gln
370 375 380

Glu Leu Lys Ser Glu Phe Glu Gln Asn Leu Ser Glu Lys Leu Ser Glu
385 390 395 400

Gln Glu Leu Gln Phe Arg Arg Leu Ser Gln Glu Gln Val Asp Asn Phe
405 410 415

Thr Leu Asp Ile Asn Thr Ala Tyr Ala Arg Leu Arg Gly Ile Glu Gln
420 425 430

Ala Val Gln Ser His Ala Val Ala Glu Glu Glu Ala Arg Lys Ala His
435 440 445

Gln Leu Trp Leu Ser Val Glu Ala Leu Lys Tyr Ser Met Lys Thr Ser
450 455 460

Ser Ala Glu Thr Pro Thr Ile Pro Leu Gly Ser Ala Val Glu Ala Ile
465 470 475 480

Lys Ala Asn Cys Ser Asp Asn Glu Phe Thr Gln Ala Leu Thr Ala Ala
485 490 495

Ile Pro Pro Glu Ser Leu Thr Arg Gly Val Tyr Ser Glu Glu Thr Leu
500 505 510

Arg Ala Arg Phe Tyr Ala Val Gln Lys Leu Ala Arg Arg Val Ala Met
515 520 525

Ile Asp Glu Thr Arg Asn Ser Leu Tyr Gln Tyr Phe Leu Ser Tyr Leu
530 535 540

Gln Ser Leu Leu Leu Phe Pro Pro Gln Gln Leu Lys Pro Pro Pro Glu
545 550 555 560

Leu Cys Pro Glu Asp Ile Asn Thr Phe Lys Leu Leu Ser Tyr Ala Ser
565 570 575

Tyr Cys Ile Glu His Gly Asp Leu Glu Leu Ala Ala Lys Phe Val Asn
580 585 590

Gln Leu Lys Gly Glu Ser Arg Arg Val Ala Gln Asp Trp Leu Lys Glu
595 600 605

Ala Arg Met Thr Leu Glu Thr Lys Gln Ile Val Glu Ile Leu Thr Ala
610 615 620

Tyr Ala Ser Ala Val Gly Ile Gly Thr Thr Gln Val Gln Pro Glu
625 630 635

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6306 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..6306 

(X) PUBLICATION INFORMATION: 

(A) AUTHORS: COMPTON, DUANE A; SZILAK, ILYA; CLEVELAND, DON W.

(B) PRIMARY STRUCTURE OF NUMA. . . 

(C) JOURNAL: Journal of Cell Biology 

(D) VOLUME: 116 

(E) RELEVANT RESIDUES IN SEQ ID NO: 3: FROM 1 TO 6306 

(F) PAGES: 1395-1408

(G) DATE: MAR - 1992 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG ACA CTC CAC GCC ACC CGG GGG GCT GCA CTC CTC TCT TGG GTG AAC 48 
Met Thr Leu His Ala Thr Arg Gly Ala Ala Leu Leu Ser Trp Val Asn 
1 5 10 15

AGT CTA CAC GTG GCT GAC CCT GTG GAG GCT GTG CTG CAG CTC CAG GAC 96 
Ser Leu His Val Ala Asp Pro Val Glu Ala Val Leu Gin Leu Gin Asp 
20 25 30 

TGC AGC ATC TTC ATC AAG ATC ATT GAC AGA ATC CAT GGC ACT GAA GAG 144
Cys Ser Ile Phe Ile Lys Ile Ile Asp Arg Ile His Gly Thr Glu Glu
35 40 45

GGA CAG CAA ATC TTG AAG CAG CCG GTG TCA GAG AGA CTG GAC TTT GTG 192
Gly Gln Gln Ile Leu Lys Gln Pro Val Ser Glu Arg Leu Asp Phe Val
50 55 60

TGC AGT TTT CTG CAG AAA AAT CGA AAA CAT CCC TCT TCC CCA GAA TGC 240
Cys Ser Phe Leu Gln Lys Asn Arg Lys His Pro Ser Ser Pro Glu Cys
65 70 75 80

CTG GTA TCT GCA CAG AAG GTG CTA GAG GGA TCA GAG CTG GAA CTG GCG 288
Leu Val Ser Ala Gln Lys Val Leu Glu Gly Ser Glu Leu Glu Leu Ala
85 90 95

AAG ATG ACC ATG CTG CTC TTA TAC CAC TCT ACC ATG AGC TCC AAA AGT 336
Lys Met Thr Met Leu Leu Leu Tyr His Ser Thr Met Ser Ser Lys Ser
100 105 110

CCC AGG GAC TGG GAA CAG TTT GAA TAT AAA ATT CAG GCT GAG TTG GCT 384
Pro Arg Asp Trp Glu Gln Phe Glu Tyr Lys Ile Gln Ala Glu Leu Ala
115 120 125

GTC ATT CTT AAA TTT GTG CTG GAC CAT GAG GAC GGG CTA AAC CTT AAT 432
Val Ile Leu Lys Phe Val Leu Asp His Glu Asp Gly Leu Asn Leu Asn
130 135 140

GAG GAC CTA GAG AAC TTC CTA CAG AAA GCT CCT GTG CCT TCT ACC TGT 480
Glu Asp Leu Glu Asn Phe Leu Gln Lys Ala Pro Val Pro Ser Thr Cys
145 150 155 160

TCT AGC ACA TTC CCT GAA GAG CTC TCC CCA CCT AGC CAC CAG GCC AAG 528
Ser Ser Thr Phe Pro Glu Glu Leu Ser Pro Pro Ser His Gln Ala Lys
165 170 175

AGG GAG ATT CGC TTC CTA GAG CTA CAG AAG GTT GCC TCC TCT TCC AGT 576
Arg Glu Ile Arg Phe Leu Glu Leu Gln Lys Val Ala Ser Ser Ser Ser
180 185 190

GGG AAC AAC TTT CTC TCA GGT TCT CCA GCT TCT CCC ATG GGT GAT ATC 624
Gly Asn Asn Phe Leu Ser Gly Ser Pro Ala Ser Pro Met Gly Asp Ile
195 200 205

CTG CAG ACC CCA CAG TTC CAG ATG AGA CGG CTG AAG AAG CAG CTT GCT 672
Leu Gln Thr Pro Gln Phe Gln Met Arg Arg Leu Lys Lys Gln Leu Ala
210 215 220

GAT GAG AGA AGT AAT AGG GAT GAG CTG GAG CTG GAG CTA GCT GAG AAC 720
Asp Glu Arg Ser Asn Arg Asp Glu Leu Glu Leu Glu Leu Ala Glu Asn
225 230 235 240

CGC AAG CTC CTC ACC GAG AAG GAT GCA CAG ATA GCC ATG ATG CAG CAG 768
Arg Lys Leu Leu Thr Glu Lys Asp Ala Gln Ile Ala Met Met Gln Gln
245 250 255
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CGC ATT GAC CGC CTA GCC CTG CTG AAT GAG AAG CAG GCG GCC AGC CCA 816 
Arg Ile Asp Arg Leu Ala Leu Leu Asn Glu Lys Gln Ala Ala Ser Pro
260 265 270 

CTG GAG CCC AAG GAG CTT GAG GAG CTG CGT GAC AAG AAT GAG AGC CTT 864 
Leu Glu Pro Lys Glu Leu Glu Glu Leu Arg Asp Lys Asn Glu Ser Leu 
275 280 285 

ACC ATG CGG CTG CAT GAA ACC CTG AAG CAG TGC CAG GAC CTG AAG ACA 912 
Thr Met Arg Leu His Glu Thr Leu Lys Gin Cys Gin Asp Leu Lys Thr 
290 295 300 

GAG AAG AGC CAG ATG GAT CGC AAA ATC AAC CAG CTT TCG GAG GAG AAT 960 
Glu Lys Ser Gin Met Asp Arg Lys He Asn Gin Leu Ser Glu Glu Asn 
305 310 315 320

GGA GAC CTT TCC TTT AAG CTG CGG GAG TTT GCC AGT CAT CTG CAG CAG 1008 
Gly Asp Leu Ser Phe Lys Leu Arg Glu Phe Ala Ser His Leu Gin Gin 
325 330 335 

CTA CAG GAT GCC CTC AAT GAG CTG ACG GAG GAG CAC AGC AAG GCC ACT 1056 
Leu Gin Asp Ala Leu Asn Glu Leu Thr Glu Glu His Ser Lys Ala Thr 
340 345 350 

CAG GAG TGG CTA GAG AAG CAG GCC CAG CTG GAG AAG GAG CTC AGC GCA 1104 
Gin Glu Trp Leu Glu Lys Gin Ala Gin Leu Glu Lys Glu Leu Ser Ala 
355 360 365 

GCC CTG CAG GAC AAG AAA TGC CTT GAA GAG AAG AAC GAA ATC CTT CAG 1152 
Ala Leu Gin Asp Lys Lys Cys Leu Glu Glu Lys Asn Glu He Leu Gin 
370 375 380 

GGA AAA CTT TCA CAG CTG GAA GAA CAC TTG TCC CAG CTG CAG GAT AAC 1200 
Gly Lys Leu Ser Gin Leu Glu Glu His Leu Ser Gin Leu Gin Asp Asn 
385 390 395 400 

CCA CCC CAG GAG AAG GGC GAG GTG CTG GGT GAT GTC TTG CAG CTG GAA 1248 
Pro Pro Gin Glu Lys Gly Glu Val Leu Gly Asp Val Leu Gin Leu Glu 
405 410 415 

ACC TTG AAG CAA GAG GCA GCC ACT CTT GCT GCA AAC AAC ACA CAG CTC 1296 
Thr Leu Lys Gin Glu Ala Ala Thr Leu Ala Ala Asn Asn Thr Gin Leu 
420 425 430 

CAA GCC AGG GTA GAG ATG CTG GAG ACT GAG CGA GGC CAG CAG GAA GCC 1344 
Gin Ala Arg Val Glu Met Leu Glu Thr Glu Arg Gly Gin Gin Glu Ala 

435 440 445 



AAG CTG CTT GCT GAG CGG GGC CAC TTC GAA GAA GAA AAG CAG CAG CTG 1392
Lys Leu Leu Ala Glu Arg Gly His Phe Glu Glu Glu Lys Gin Gin Leu 
450 455 460 
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TCT AGC CTG ATC ACT GAC CTG CAG AGC TCC ATC TCC AAC CTC AGC CAG 1440 
Ser Ser Leu He Thr Asp Leu Gin Ser Ser He Ser Asn Leu Ser Gin 
465 470 475 480 

GCC AAG GAA GAG CTG GAG CAG GCC TCC CAG GCT CAT GGG GCC CGG TTG 1488 
Ala Lys Glu Glu Leu Glu Gin Ala Ser Gin Ala His Gly Ala Arg Leu 
485 490 495

ACT GCC CAG GTG GCC TCT CTG ACC TCT GAG CTC ACC ACA CTC AAT GCC 1536 
Thr Ala Gin Val Ala Ser Leu Thr Ser Glu Leu Thr Thr Leu Asn Ala 
500 505 510 

ACC ATC CAG CAA CAG GAT CAA GAA CTG GCT GGC CTG AAG CAG CAG GCC 1584 
Thr He Gin Gin Gin Asp Gin Glu Leu Ala Gly Leu Lys Gin Gin Ala 
515 520 525 

AAA GAG AAG CAG GCC CAG CTA GCA CAG ACC CTC CAA CAG CAA GAA CAG 1632 
Lys Glu Lys Gin Ala Gin Leu Ala Gin Thr Leu Gin Gin Gin Glu Gin 
530 535 540 

GCC TCC CAG GGC CTC CGC CAC CAG GTG GAG CAG CTA AGC AGT AGC CTG 1680 
Ala Ser Gin Gly Leu Arg His Gin Val Glu Gin Leu Ser Ser Ser Leu 
545 550 555 560 

AAG CAG AAG GAG CAG CAG TTG AAG GAG GTA GCG GAG AAG CAG GAG GCA 1728 
Lys Gin Lys Glu Gin Gin Leu Lys Glu Val Ala Glu Lys Gin Glu Ala 
565 570 575 

ACT AGG CAG GAC CAT GCC CAG CAA CTG GCC ACT GCT GCA GAG GAG CGA 1776 
Thr Arg Gin Asp His Ala Gin Gin Leu Ala Thr Ala Ala Glu Glu Arg 
580 585 590 

GAG GCC TCC TTA AGG GAG CGG GAT GCG GCT CTC AAG CAG CTG GAG GCA 1824 
Glu Ala Ser Leu Arg Glu Arg Asp Ala Ala Leu Lys Gin Leu Glu Ala 
595 600 605

CTG GAG AAG GAG AAG GCT GCC AAG CTG GAG ATT CTG CAG CAG CAA CTT 1872 
Leu Glu Lys Glu Lys Ala Ala Lys Leu Glu He Leu Gin Gin Gin Leu 
610 615 620 

CAG GTG GCT AAT GAA GCC CGG GAC AGT GCC CAG ACC TCA GTG ACA CAG 1920 
Gin Val Ala Asn Glu Ala Arg Asp Ser Ala Gin Thr Ser Val Thr Gin 
625 630 635 640 

GCC CAG CGG GAG AAG GCA GAG CTG AGC CGG AAG GTG GAG GAA CTC CAG 1968 
Ala Gin Arg Glu Lys Ala Glu Leu Ser Arg Lys Val Glu Glu Leu Gin 
645 650 655 

GCC TGT GTT GAG ACA GCC CGC CAG GAA CAG CAT GAG GCC CAG GCC CAG 2016 
Ala Cys Val Glu Thr Ala Arg Gin Glu Gin His Glu Ala Gin Ala Gin 
660 665 670 
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GTT GCA GAG CTA GAG TTG CAG CTG CGG TCT GAG CAG CAA AAA GCA ACT 2064 
Val Ala Glu Leu Glu Leu Gin Leu Arg Ser Glu Gin Gin Lys Ala Thr 
675 680 685 

GAG AAA GAA AGG GTG GCC CAG GAG AAG GAC CAG CTC CAG GAG CAG CTC 2112 
Glu Lys Glu Arg Val Ala Gin Glu Lys Asp Gin Leu Gin Glu Gin Leu 
690 695 700 

CAG GCC CTC AAA GAG TCC TTG AAG GTC ACC AAG GGC AGC CTT GAA GAG 2160 
Gin Ala Leu Lys Glu Ser Leu Lys Val Thr Lys Gly Ser Leu Glu Glu 
705 710 715 720 

GAG AAG CGC AGG GCT GCA GAT GCC CTG GAA GAG CAG CAG CGT TGT ATC 2208 
Glu Lys Arg Arg Ala Ala Asp Ala Leu Glu Glu Gln Gln Arg Cys Ile
725 730 735 

TCT GAG CTG AAG GCA GAG ACC CGA AGC CTG GTG GAG CAG CAT AAG CGG 2256 
Ser Glu Leu Lys Ala Glu Thr Arg Ser Leu Val Glu Gln His Lys Arg
740 745 750 

GAA CGA AAG GAG CTG GAA GAA GAG AGG GCT GGG CGC AAG GGG CTG GAG 2304 
Glu Arg Lys Glu Leu Glu Glu Glu Arg Ala Gly Arg Lys Gly Leu Glu 
755 760 765 

GCT CGA TTA CTG CAG CTT GGG GAG GCC CAT CAG GCT GAG ACT GAA GTC 2352 
Ala Arg Leu Leu Gln Leu Gly Glu Ala His Gln Ala Glu Thr Glu Val
770 775 780

CTG CGG CGG GAG CTG GCA GAG GCC ATG GCT GCC CAG CAC ACA GCT GAG 2400 
Leu Arg Arg Glu Leu Ala Glu Ala Met Ala Ala Gln His Thr Ala Glu
785 790 795 800 

AGT GAG TGT GAG CAG CTC GTC AAA GAA GTA GCT GCC TGG CGT GAC GGG 2448 
Ser Glu Cys Glu Gln Leu Val Lys Glu Val Ala Ala Trp Arg Asp Gly
805 810 815

TAT GAG GAT AGC CAG CAA GAG GAG GCA CAG TAT GGC GCC ATG TTC CAG 2496 
Tyr Glu Asp Ser Gln Gln Glu Glu Ala Gln Tyr Gly Ala Met Phe Gln
820 825 830

GAA CAG CTG ATG ACT TTG AAG GAG GAA TGT GAG AAG GCC CGC CAG GAG 2544 
Glu Gln Leu Met Thr Leu Lys Glu Glu Cys Glu Lys Ala Arg Gln Glu
835 840 845 

CTG CAG GAG GCA AAG GAG AAG GTG GCA GGC ATA GAA TCC CAC AGC GAG 2592 
Leu Gln Glu Ala Lys Glu Lys Val Ala Gly Ile Glu Ser His Ser Glu
850 855 860 



CTC CAG ATA AGC CGG CAG CAG AAC AAA CTA GCT GAG CTC CAT GCC AAC 2640 
Leu Gln Ile Ser Arg Gln Gln Asn Lys Leu Ala Glu Leu His Ala Asn
865 870 875 880 

CTG GCC AGA GCA CTC CAG GAG GTC CAA GAG AAG GAA GTC AGG GCC GAG 2688
Leu Ala Arg Ala Leu Gln Gln Val Gln Glu Lys Glu Val Arg Ala Gln
885 890 895 

AAG CTT GCA GAT GAG CTC TCC ACT CTG CAG GAA AAG ATG GCT GCC ACC 2736 
Lys Leu Ala Asp Asp Leu Ser Thr Leu Gln Glu Lys Met Ala Ala Thr
900 905 910 

AGC AAA GAG GTG GCC CGC TTG GAG ACC TTG GTC CGC AAG GCA GGT GAG 2784 
Ser Lys Glu Val Ala Arg Leu Glu Thr Leu Val Arg Lys Ala Gly Glu 
915 920 925 

CAG CAG GAA ACA GCC TCC CGG GAG TTA GTC AAG GAG CCT GCG AGG GCA 2832 
Gln Gln Glu Thr Ala Ser Arg Glu Leu Val Lys Glu Pro Ala Arg Ala
930 935 940 

GGA GAC AGA CAG CCC GAG TGG CTG GAA GAG CAA CAG GGA CGC CAG TTC 2880 
Gly Asp Arg Gln Pro Glu Trp Leu Glu Glu Gln Gln Gly Arg Gln Phe
945 950 955 960 

TGC AGC ACA CAG GCA GCG CTG CAG GCT ATG GAG CGG GAG GCA GAG CAG 2928 
Cys Ser Thr Gln Ala Ala Leu Gln Ala Met Glu Arg Glu Ala Glu Gln
965 970 975 

ATG GGC AAT GAG CTG GAA CGG CTG CGG GCC GCG CTG ATG GAG AGC CAG 2976 
Met Gly Asn Glu Leu Glu Arg Leu Arg Ala Ala Leu Met Glu Ser Gln
980 985 990 

GGG CAG CAG CAG GAG GAG CGT GGG CAG CAG GAA AGG GAG GTG GCG CGG 3024 
Gly Gln Gln Gln Glu Glu Arg Gly Gln Gln Glu Arg Glu Val Ala Arg
995 1000 1005 

CTG ACC CAG GAG CGG GGC CGT GCC CAG GCT GAC CTT GCC CTG GAG AAG 3072
Leu Thr Gln Glu Arg Gly Arg Ala Gln Ala Asp Leu Ala Leu Glu Lys
1010 1015 1020 

GCG GCC AGA GCA GAG CTT GAG ATG CGG CTG CAG AAC GCC CTC AAC GAG 3120 
Ala Ala Arg Ala Glu Leu Glu Met Arg Leu Gln Asn Ala Leu Asn Glu
1025 1030 1035 1040 

CAG CGT GTG GAG TTC GCT ACC CTG CAA GAG GCA CTG GCT CAT GCC CTG 3168 
Gln Arg Val Glu Phe Ala Thr Leu Gln Glu Ala Leu Ala His Ala Leu
1045 1050 1055 

ACG GAA AAG GAA GGC AAG GAC CAG GAG TTG GCC AAG CTT CGT GGT CTG 3216 
Thr Glu Lys Glu Gly Lys Asp Gln Glu Leu Ala Lys Leu Arg Gly Leu
1060 1065 1070 

GAG GCA GCC CAG ATA AAA GAG CTG GAG GAA CTT CGG CAA ACC GTG AAG 3264 
Glu Ala Ala Gln Ile Lys Glu Leu Glu Glu Leu Arg Gln Thr Val Lys
1075 1080 1085 

CAA CTG AAG GAA CAG CTG GCT AAG AAA GAA AAG GAG CAC GCA TCT GGC 3312
Gln Leu Lys Glu Gln Leu Ala Lys Lys Glu Lys Glu His Ala Ser Gly
1090 1095 1100 

TCA GGA GCC CAA TCT GAG GCT GCT GGC AGG ACA GAG CCA ACA GGC CCC 3360 
Ser Gly Ala Gln Ser Glu Ala Ala Gly Arg Thr Glu Pro Thr Gly Pro
1105 1110 1115 1120 

AAG CTG GAA GCA CTG CGG GCA GAG GTG AGC AAG CTG GAA CAG CAA TGC 3408 
Lys Leu Glu Ala Leu Arg Ala Glu Val Ser Lys Leu Glu Gln Gln Cys
1125 1130 1135 

CAG AAG CAG CAG GAG CAG GCT GAC AGC CTG GAA CGC AGC CTC GAG GCT 3456 
Gln Lys Gln Gln Glu Gln Ala Asp Ser Leu Glu Arg Ser Leu Glu Ala
1140 1145 1150 

GAG CGG GCC TCC CGG GCT GAG CGG GAC AGT GCT CTG GAG ACT CTG CAG 3504 
Glu Arg Ala Ser Arg Ala Glu Arg Asp Ser Ala Leu Glu Thr Leu Gln
1155 1160 1165 

GGC CAG TTA GAG GAG AAG GCC CAG GAG CTA GGG CAC AGT CAG AGT GCC 3552 
Gly Gln Leu Glu Glu Lys Ala Gln Glu Leu Gly His Ser Gln Ser Ala
1170 1175 1180 

TTA GCC TCG GCC CAA CGG GAG TTG GCT GCC TTC CGC ACC AAG GTA CAA 3600 
Leu Ala Ser Ala Gln Arg Glu Leu Ala Ala Phe Arg Thr Lys Val Gln
1185 1190 1195 1200 



GAC CAC AGC AAG GCT GAA GAT GAG TGG AAG GCC CAG GTG GCC CGG GGC 3648
Asp His Ser Lys Ala Glu Asp Glu Trp Lys Ala Gln Val Ala Arg Gly
1205 1210 1215



CGG CAA GAG GCT GAG AGG AAA AAT AGC CTC ATC AGC AGC TTG GAG GAG 3696 
Arg Gln Glu Ala Glu Arg Lys Asn Ser Leu Ile Ser Ser Leu Glu Glu
1220 1225 1230 

GAG GTG TCC ATC CTG AAT CGC CAG GTC CTG GAG AAG GAG GGG GAG AGC 3744 
Glu Val Ser Ile Leu Asn Arg Gln Val Leu Glu Lys Glu Gly Glu Ser
1235 1240 1245 

AAG GAG TTG AAG CGG CTG GTG ATG GCC GAG TCA GAG AAG AGC CAG AAG 3792 
Lys Glu Leu Lys Arg Leu Val Met Ala Glu Ser Glu Lys Ser Gln Lys
1250 1255 1260 

CTG GAG GAG AGC TGC GCC TGC TGC AGG CAG AGA CAG CCA GCA ACA GTG 3840 
Leu Glu Glu Ser Cys Ala Cys Cys Arg Gln Arg Gln Pro Ala Thr Val
1265 1270 1275 1280 



CCA GAG CTG CAG AAC GCA GCT CTG CTC TGC GGG AGG AGG TGC AGA GCC 3888
Pro Glu Leu Gln Asn Ala Ala Leu Leu Cys Gly Arg Arg Cys Arg Ala
1285 1290 1295

TCC GGG AGG GAG GCT GAG AAA CAG CGG GTG GCT TCA GAG AAC CTG CGG 3936
Ser Gly Arg Glu Ala Glu Lys Gln Arg Val Ala Ser Glu Asn Leu Arg
1300 1305 1310 

CAG GAG CTG ACC TCA CAG GCT GAG CGT GCG GAG GAG CTG GGC CAA GAA 3984 
Gln Glu Leu Thr Ser Gln Ala Glu Arg Ala Glu Glu Leu Gly Gln Glu
1315 1320 1325 

TTG AAG GCG TGG CAG GAG AAG TTC TTC CAG AAA GAG CAG GCC CTC TCC 4032
Leu Lys Ala Trp Gln Glu Lys Phe Phe Gln Lys Glu Gln Ala Leu Ser
1330 1335 1340 

ACC CTG CAG CTC GAG CAC ACC AGC ACA CAG GCC CTG GTG AGT GAG CTG 4080 
Thr Leu Gln Leu Glu His Thr Ser Thr Gln Ala Leu Val Ser Glu Leu
1345 1350 1355 1360 

CTG CCA GCT AAG CAC CTC TGC CAG CAG CTG CAG GCC GAG CAG GCC GCT 4128 
Leu Pro Ala Lys His Leu Cys Gln Gln Leu Gln Ala Glu Gln Ala Ala
1365 1370 1375 

GCC GAG AAA CGC CAC CGT GAG GAG CTG GAG CAG AGC AAG CAG GCC GCT 4176 
Ala Glu Lys Arg His Arg Glu Glu Leu Glu Gln Ser Lys Gln Ala Ala
1380 1385 1390 

GGG GGA CTG CGG GCA GAG CTG CTG CGG GCC CAG CGG GAG CTT GGG GAG 4224 
Gly Gly Leu Arg Ala Glu Leu Leu Arg Ala Gln Arg Glu Leu Gly Glu
1395 1400 1405 

CTG ATT CCT CTG CGG CAG AAG GTG GCA GAG CAG GAG CGA ACA GCT CAG 4272 
Leu Ile Pro Leu Arg Gln Lys Val Ala Glu Gln Glu Arg Thr Ala Gln
1410 1415 1420 

CAG CTG CGG GCA GAG AAG GCC AGC TAT GCA GAG CAG CTG AGC ATG CTG 4320 
Gln Leu Arg Ala Glu Lys Ala Ser Tyr Ala Glu Gln Leu Ser Met Leu
1425 1430 1435 1440 

AAG AAG GCG CAT GGC CTG CTG GCA GAG GAG AAC CGG GGG CTG GGT GAG 4368 
Lys Lys Ala His Gly Leu Leu Ala Glu Glu Asn Arg Gly Leu Gly Glu 
1445 1450 1455 

CGG GCC AAC CTT GGC CGG CAG TTT CTG GAA GTG GAG TTG GAC CAG GCC 4416 
Arg Ala Asn Leu Gly Arg Gln Phe Leu Glu Val Glu Leu Asp Gln Ala
1460 1465 1470 

CGG GAA AAG TAT GTC CAA GAG TTG GCA GCC GTA CGT GCT GAT GCT GAG 4464 
Arg Glu Lys Tyr Val Gln Glu Leu Ala Ala Val Arg Ala Asp Ala Glu
1475 1480 1485

ACC CGT CTG GCT GAG GTG CAG CGA GAA GCA CAG AGC ACT GCC CGG GAG 4512 
Thr Arg Leu Ala Glu Val Gln Arg Glu Ala Gln Ser Thr Ala Arg Glu
1490 1495 1500 

CTG GAG GTG ATG ACT GCC AAG TAT GAG GGT GCC AAG GTC AAG GTC CTG 4560
Leu Glu Val Met Thr Ala Lys Tyr Glu Gly Ala Lys Val Lys Val Leu 
1505 1510 1515 1520 

GAG GAG AGG CAG CGG TTC CAG GAA GAG AGG CAG AAA CTG ACT GCC CAG 4608 
Glu Glu Arg Gln Arg Phe Gln Glu Glu Arg Gln Lys Leu Thr Ala Gln
1525 1530 1535 

GTG GAA GAA CTG AGT AAG AAA CTG GCT GAC TCT GAC CAA GCC AGC AAG 4656 
Val Glu Glu Leu Ser Lys Lys Leu Ala Asp Ser Asp Gln Ala Ser Lys
1540 1545 1550 

GTG CAG CAG CAG AAG CTG AAG GCT GTC CAG GCT CAG GGA GGC GAG AGC 4704 
Val Gln Gln Gln Lys Leu Lys Ala Val Gln Ala Gln Gly Gly Glu Ser
1555 1560 1565

CAG CAG GAG GCC CAG CGC TTC CAG GCC CAG CTG AAT GAA CTG CAA GCC 4752 
Gln Gln Glu Ala Gln Arg Phe Gln Ala Gln Leu Asn Glu Leu Gln Ala
1570 1575 1580 

CAG TTG AGC CAG AAG GAG CAG GCA GCT GAG CAC TAT AAG CTG CAG ATG 4800 
Gln Leu Ser Gln Lys Glu Gln Ala Ala Glu His Tyr Lys Leu Gln Met
1585 1590 1595 1600 

GAG AAA GCC AAA ACA CAT TAT GAT GCC AAG AAG CAG CAG AAC CAA GAG 4848 
Glu Lys Ala Lys Thr His Tyr Asp Ala Lys Lys Gln Gln Asn Gln Glu
1605 1610 1615 

CTG CAG GAG CAG CTG CGG AGC CTG GAG CAG CTG CAG AAG GAA AAC AAA 4896 
Leu Gln Glu Gln Leu Arg Ser Leu Glu Gln Leu Gln Lys Glu Asn Lys
1620 1625 1630 

GAG CTG CGA GCT GAA GCT GAA CGG CTG GGC CAT GAG CTA CAG CAG GCT 4944 
Glu Leu Arg Ala Glu Ala Glu Arg Leu Gly His Glu Leu Gln Gln Ala
1635 1640 1645 

GGG CTG AAG ACC AAG GAG GCT GAA CAG ACC TGC CGC CAC CTT ACT GCC 4992 
Gly Leu Lys Thr Lys Glu Ala Glu Gln Thr Cys Arg His Leu Thr Ala
1650 1655 1660 

CAG GTG CGC AGC CTG GAG GCA CAG GTT GCC CAT GCA GAC CAG CAG CTT 5040 
Gln Val Arg Ser Leu Glu Ala Gln Val Ala His Ala Asp Gln Gln Leu
1665 1670 1675 1680 

CGA GAC CTG GGC AAA TTC CAG GTG GCA ACT GAT GCT TTA AAG AGC CGT 5088 
Arg Asp Leu Gly Lys Phe Gln Val Ala Thr Asp Ala Leu Lys Ser Arg
1685 1690 1695 

GAG CCC CAG GCT AAG CCC CAG CTG GAC TTG AGT ATT GAC AGC CTG GAT 5136 
Glu Pro Gln Ala Lys Pro Gln Leu Asp Leu Ser Ile Asp Ser Leu Asp
1700 1705 1710 

CTG AGC TGC GAG GAG GGG ACC CCA CTC AGT ATC ACC AGC AAG CTG CCT 5184
Leu Ser Cys Glu Glu Gly Thr Pro Leu Ser Ile Thr Ser Lys Leu Pro
1715 1720 1725 

CGT ACC CAG CCA GAC GGC ACC AGC GTC CCT GGA GAA CCA GCC TCA CCT 5232 
Arg Thr Gln Pro Asp Gly Thr Ser Val Pro Gly Glu Pro Ala Ser Pro
1730 1735 1740 

ATC TCC CAG CGC CTG CCC CCC AAG GTA GAA TCC CTG GAG AGT CTC TAC 5280 
Ile Ser Gln Arg Leu Pro Pro Lys Val Glu Ser Leu Glu Ser Leu Tyr
1745 1750 1755 1760 

TTC ACT CCC ATC CCT GCT CGG AGT CAG GCC CCC CTG GAG AGC AGC CTG 5328 
Phe Thr Pro Ile Pro Ala Arg Ser Gln Ala Pro Leu Glu Ser Ser Leu
1765 1770 1775 

GAC TCC CTG GGA GAC GTC TTC CTG GAC TCG GGT CGT AAG ACC CGC TCC 5376 
Asp Ser Leu Gly Asp Val Phe Leu Asp Ser Gly Arg Lys Thr Arg Ser 
1780 1785 1790 

GCT CGT CGG CGC ACC ACG CAG ATC ATC AAC ATC ACC ATG ACC AAG AAG 5424 
Ala Arg Arg Arg Thr Thr Gln Ile Ile Asn Ile Thr Met Thr Lys Lys
1795 1800 1805 

CTA GAT GTG GAA GAG CCA GAC AGC GCC AAC TCA TCG TTC TAC AGC ACG 5472 
Leu Asp Val Glu Glu Pro Asp Ser Ala Asn Ser Ser Phe Tyr Ser Thr 
1810 1815 1820 

CGG TCT GCT CCT GCT TCC CAG GCT AGC CTG CGA GCC ACC TCC TCT ACT 5520 
Arg Ser Ala Pro Ala Ser Gln Ala Ser Leu Arg Ala Thr Ser Ser Thr
1825 1830 1835 1840 

CAG TCT CTA GCT CGC CTG GGT TCT CCC GAT TAT GGC AAC TCA GCC CTG 5568 
Gln Ser Leu Ala Arg Leu Gly Ser Pro Asp Tyr Gly Asn Ser Ala Leu
1845 1850 1855 

CTC AGC TTG CCT GGC TAC CGC CCC ACC ACT CGC AGT TCT GCT CGT CGT 5616 
Leu Ser Leu Pro Gly Tyr Arg Pro Thr Thr Arg Ser Ser Ala Arg Arg 
1860 1865 1870 

TCC CAG GCC GGG GTG TCC AGT GGG GCC CCT CCA GGA AGG AAC AGC TTC 5664 
Ser Gln Ala Gly Val Ser Ser Gly Ala Pro Pro Gly Arg Asn Ser Phe
1875 1880 1885 

TAC ATG GGC ACT TGC CAG GAT GAG CCT GAG CAG CTG GAT GAC TGG AAC 5712 
Tyr Met Gly Thr Cys Gln Asp Glu Pro Glu Gln Leu Asp Asp Trp Asn
1890 1895 1900 

CGC ATT GCA GAG CTG CAG CAG CGC AAT CGA GTG TGC CCC CCA CAT CTG 5760
Arg Ile Ala Glu Leu Gln Gln Arg Asn Arg Val Cys Pro Pro His Leu
1905 1910 1915 1920

AAG ACC TGC TAT CCC CTG GAG TCC AGG CCT TCC CTG AGC CTG GGC ACC 5808
Lys Thr Cys Tyr Pro Leu Glu Ser Arg Pro Ser Leu Ser Leu Gly Thr
1925 1930 1935

ATC ACA GAT GAG GAG ATG AAA ACT GGA GAC CCC CAA GAG ACC CTG CGC 5856 
Ile Thr Asp Glu Glu Met Lys Thr Gly Asp Pro Gln Glu Thr Leu Arg
1940 1945 1950 

CGA GCC AGC ATG CAG CCA ATC CAG ATA GCC GAG GGC ACT GGC ATC ACC 5904 
Arg Ala Ser Met Gln Pro Ile Gln Ile Ala Glu Gly Thr Gly Ile Thr
1955 1960 1965

ACC CGG CAG CAG CGC AAA CGG GTC TCC CTA GAG CCC CAC CAG GGC CCT 5952 
Thr Arg Gln Gln Arg Lys Arg Val Ser Leu Glu Pro His Gln Gly Pro
1970 1975 1980 

GGA ACT CCT GAG TCT AAG AAG GCC ACC AGC TGT TTC CCA CGC CCC ATG 6000 
Gly Thr Pro Glu Ser Lys Lys Ala Thr Ser Cys Phe Pro Arg Pro Met 
1985 1990 1995 2000 

ACT CCC CGA GAC CGA CAT GAA GGG CGC AAA CAG AGC ACT ACT GAG GCC 6048 
Thr Pro Arg Asp Arg His Glu Gly Arg Lys Gln Ser Thr Thr Glu Ala
2005 2010 2015 

CAG AAG AAA GCA GCT CCA GCT TCT ACT AAA CAG GCT GAC CGG CGC CAG 6096 
Gln Lys Lys Ala Ala Pro Ala Ser Thr Lys Gln Ala Asp Arg Arg Gln
2020 2025 2030 

TCG ATG GCC TTC AGC ATC CTC AAC ACA CCC AAG AAG CTA GGG AAC AGC 6144 
Ser Met Ala Phe Ser Ile Leu Asn Thr Pro Lys Lys Leu Gly Asn Ser
2035 2040 2045 

CTT CTG CGG CGG GGA GCC TCA AAG AAG GCC CTG TCC AAG GCT TCC CCC 6192 
Leu Leu Arg Arg Gly Ala Ser Lys Lys Ala Leu Ser Lys Ala Ser Pro 
2050 2055 2060 

AAC ACT CGC AGT GGA ACC CGC CGT TCT CCG CGC ATT GCC ACC ACC ACA 6240 
Asn Thr Arg Ser Gly Thr Arg Arg Ser Pro Arg Ile Ala Thr Thr Thr
2065 2070 2075 2080 

GCC AGT GCC GCC ACT GCT GCC GCC ATT GGT GCC ACC CCT CGA GCC AAG 6288 
Ala Ser Ala Ala Thr Ala Ala Ala Ile Gly Ala Thr Pro Arg Ala Lys
2085 2090 2095 

GGC AAG GCA AAG CAC TAA 6306 
Gly Lys Ala Lys His
2100
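
The numbering of the listing above can be cross-checked with simple arithmetic. The following sketch is not part of the original disclosure. It assumes that the coding region begins at nucleotide 1 and is read in frame, as the running counts suggest (nucleotide 1632 closes the row whose last residue is 544); the helper names are hypothetical.

```python
# Hypothetical helpers for checking the listing's numbering; not from the source.
# Assumption: the coding region starts at nucleotide 1 and is read in frame.

# Three-letter codes for the codons that occur on the final row only.
FINAL_ROW_CODONS = {
    "GGC": "Gly",
    "AAG": "Lys",
    "GCA": "Ala",
    "CAC": "His",
    "TAA": "Stop",
}


def residue_at(nucleotide_end: int) -> int:
    """Residue number whose codon ends at the given 1-based nucleotide."""
    if nucleotide_end % 3 != 0:
        raise ValueError("position does not fall on a codon boundary")
    return nucleotide_end // 3


def translate_row(row: str) -> list[str]:
    """Translate a space-separated row of codons using FINAL_ROW_CODONS."""
    return [FINAL_ROW_CODONS[codon] for codon in row.split()]


last_row = translate_row("GGC AAG GCA AAG CAC TAA")
coding_residues = residue_at(6306) - 1  # subtract the TAA stop codon
print(last_row, coding_residues)
```

Under these assumptions the residue count agrees with the length stated for SEQ ID NO: 4.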

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2101 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Thr Leu His Ala Thr Arg Gly Ala Ala Leu Leu Ser Trp Val Asn
1 5 10 15

Ser Leu His Val Ala Asp Pro Val Glu Ala Val Leu Gln Leu Gln Asp
20 25 30

Cys Ser Ile Phe Ile Lys Ile Ile Asp Arg Ile His Gly Thr Glu Glu
35 40 45

Gly Gln Gln Ile Leu Lys Gln Pro Val Ser Glu Arg Leu Asp Phe Val
50 55 60

Cys Ser Phe Leu Gln Lys Asn Arg Lys His Pro Ser Ser Pro Glu Cys
65 70 75 80

Leu Val Ser Ala Gln Lys Val Leu Glu Gly Ser Glu Leu Glu Leu Ala
85 90 95

Lys Met Thr Met Leu Leu Leu Tyr His Ser Thr Met Ser Ser Lys Ser
100 105 110

Pro Arg Asp Trp Glu Gln Phe Glu Tyr Lys Ile Gln Ala Glu Leu Ala
115 120 125

Val Ile Leu Lys Phe Val Leu Asp His Glu Asp Gly Leu Asn Leu Asn
130 135 140

Glu Asp Leu Glu Asn Phe Leu Gln Lys Ala Pro Val Pro Ser Thr Cys
145 150 155 160

Ser Ser Thr Phe Pro Glu Glu Leu Ser Pro Pro Ser His Gln Ala Lys
165 170 175

Arg Glu Ile Arg Phe Leu Glu Leu Gln Lys Val Ala Ser Ser Ser Ser
180 185 190

Gly Asn Asn Phe Leu Ser Gly Ser Pro Ala Ser Pro Met Gly Asp Ile
195 200 205

Leu Gln Thr Pro Gln Phe Gln Met Arg Arg Leu Lys Lys Gln Leu Ala
210 215 220

Asp Glu Arg Ser Asn Arg Asp Glu Leu Glu Leu Glu Leu Ala Glu Asn
225 230 235 240

Arg Lys Leu Leu Thr Glu Lys Asp Ala Gln Ile Ala Met Met Gln Gln
245 250 255

Arg Ile Asp Arg Leu Ala Leu Leu Asn Glu Lys Gln Ala Ala Ser Pro
260 265 270

Leu Glu Pro Lys Glu Leu Glu Glu Leu Arg Asp Lys Asn Glu Ser Leu
275 280 285

Thr Met Arg Leu His Glu Thr Leu Lys Gln Cys Gln Asp Leu Lys Thr
290 295 300 

Glu Lys Ser Gln Met Asp Arg Lys Ile Asn Gln Leu Ser Glu Glu Asn
305 310 315 320 

Gly Asp Leu Ser Phe Lys Leu Arg Glu Phe Ala Ser His Leu Gln Gln
325 330 335

Leu Gln Asp Ala Leu Asn Glu Leu Thr Glu Glu His Ser Lys Ala Thr
340 345 350

Gln Glu Trp Leu Glu Lys Gln Ala Gln Leu Glu Lys Glu Leu Ser Ala
355 360 365

Ala Leu Gln Asp Lys Lys Cys Leu Glu Glu Lys Asn Glu Ile Leu Gln
370 375 380

Gly Lys Leu Ser Gln Leu Glu Glu His Leu Ser Gln Leu Gln Asp Asn
385 390 395 400

Pro Pro Gln Glu Lys Gly Glu Val Leu Gly Asp Val Leu Gln Leu Glu
405 410 415

Thr Leu Lys Gln Glu Ala Ala Thr Leu Ala Ala Asn Asn Thr Gln Leu
420 425 430

Gln Ala Arg Val Glu Met Leu Glu Thr Glu Arg Gly Gln Gln Glu Ala
435 440 445

Lys Leu Leu Ala Glu Arg Gly His Phe Glu Glu Glu Lys Gln Gln Leu
450 455 460

Ser Ser Leu Ile Thr Asp Leu Gln Ser Ser Ile Ser Asn Leu Ser Gln
465 470 475 480

Ala Lys Glu Glu Leu Glu Gln Ala Ser Gln Ala His Gly Ala Arg Leu
485 490 495

Thr Ala Gln Val Ala Ser Leu Thr Ser Glu Leu Thr Thr Leu Asn Ala
500 505 510

Thr Ile Gln Gln Gln Asp Gln Glu Leu Ala Gly Leu Lys Gln Gln Ala
515 520 525

Lys Glu Lys Gln Ala Gln Leu Ala Gln Thr Leu Gln Gln Gln Glu Gln
530 535 540

Ala Ser Gln Gly Leu Arg His Gln Val Glu Gln Leu Ser Ser Ser Leu
545 550 555 560

Lys Gln Lys Glu Gln Gln Leu Lys Glu Val Ala Glu Lys Gln Glu Ala
565 570 575

Thr Arg Gln Asp His Ala Gln Gln Leu Ala Thr Ala Ala Glu Glu Arg
580 585 590

Glu Ala Ser Leu Arg Glu Arg Asp Ala Ala Leu Lys Gln Leu Glu Ala
595 600 605

Leu Glu Lys Glu Lys Ala Ala Lys Leu Glu Ile Leu Gln Gln Gln Leu
610 615 620

Gln Val Ala Asn Glu Ala Arg Asp Ser Ala Gln Thr Ser Val Thr Gln
625 630 635 640

Ala Gln Arg Glu Lys Ala Glu Leu Ser Arg Lys Val Glu Glu Leu Gln
645 650 655

Ala Cys Val Glu Thr Ala Arg Gln Glu Gln His Glu Ala Gln Ala Gln
660 665 670

Val Ala Glu Leu Glu Leu Gln Leu Arg Ser Glu Gln Gln Lys Ala Thr
675 680 685

Glu Lys Glu Arg Val Ala Gln Glu Lys Asp Gln Leu Gln Glu Gln Leu
690 695 700

Gln Ala Leu Lys Glu Ser Leu Lys Val Thr Lys Gly Ser Leu Glu Glu
705 710 715 720

Glu Lys Arg Arg Ala Ala Asp Ala Leu Glu Glu Gln Gln Arg Cys Ile
725 730 735

Ser Glu Leu Lys Ala Glu Thr Arg Ser Leu Val Glu Gln His Lys Arg
740 745 750

Glu Arg Lys Glu Leu Glu Glu Glu Arg Ala Gly Arg Lys Gly Leu Glu
755 760 765

Ala Arg Leu Leu Gln Leu Gly Glu Ala His Gln Ala Glu Thr Glu Val
770 775 780

Leu Arg Arg Glu Leu Ala Glu Ala Met Ala Ala Gln His Thr Ala Glu
785 790 795 800

Ser Glu Cys Glu Gln Leu Val Lys Glu Val Ala Ala Trp Arg Asp Gly
805 810 815

Tyr Glu Asp Ser Gln Gln Glu Glu Ala Gln Tyr Gly Ala Met Phe Gln
820 825 830

Glu Gln Leu Met Thr Leu Lys Glu Glu Cys Glu Lys Ala Arg Gln Glu
835 840 845

Leu Gln Glu Ala Lys Glu Lys Val Ala Gly Ile Glu Ser His Ser Glu
850 855 860

Leu Gln Ile Ser Arg Gln Gln Asn Lys Leu Ala Glu Leu His Ala Asn
865 870 875 880

Leu Ala Arg Ala Leu Gln Gln Val Gln Glu Lys Glu Val Arg Ala Gln
885 890 895

Lys Leu Ala Asp Asp Leu Ser Thr Leu Gln Glu Lys Met Ala Ala Thr
900 905 910

Ser Lys Glu Val Ala Arg Leu Glu Thr Leu Val Arg Lys Ala Gly Glu
915 920 925

Gln Gln Glu Thr Ala Ser Arg Glu Leu Val Lys Glu Pro Ala Arg Ala
930 935 940

Gly Asp Arg Gln Pro Glu Trp Leu Glu Glu Gln Gln Gly Arg Gln Phe
945 950 955 960

Cys Ser Thr Gln Ala Ala Leu Gln Ala Met Glu Arg Glu Ala Glu Gln
965 970 975

Met Gly Asn Glu Leu Glu Arg Leu Arg Ala Ala Leu Met Glu Ser Gln
980 985 990

Gly Gln Gln Gln Glu Glu Arg Gly Gln Gln Glu Arg Glu Val Ala Arg
995 1000 1005

Leu Thr Gln Glu Arg Gly Arg Ala Gln Ala Asp Leu Ala Leu Glu Lys
1010 1015 1020

Ala Ala Arg Ala Glu Leu Glu Met Arg Leu Gln Asn Ala Leu Asn Glu
1025 1030 1035 1040

Gln Arg Val Glu Phe Ala Thr Leu Gln Glu Ala Leu Ala His Ala Leu
1045 1050 1055

Thr Glu Lys Glu Gly Lys Asp Gln Glu Leu Ala Lys Leu Arg Gly Leu
1060 1065 1070

Glu Ala Ala Gln Ile Lys Glu Leu Glu Glu Leu Arg Gln Thr Val Lys
1075 1080 1085

Gln Leu Lys Glu Gln Leu Ala Lys Lys Glu Lys Glu His Ala Ser Gly
1090 1095 1100

Ser Gly Ala Gln Ser Glu Ala Ala Gly Arg Thr Glu Pro Thr Gly Pro
1105 1110 1115 1120

Lys Leu Glu Ala Leu Arg Ala Glu Val Ser Lys Leu Glu Gln Gln Cys
1125 1130 1135

Gln Lys Gln Gln Glu Gln Ala Asp Ser Leu Glu Arg Ser Leu Glu Ala
1140 1145 1150

Glu Arg Ala Ser Arg Ala Glu Arg Asp Ser Ala Leu Glu Thr Leu Gln
1155 1160 1165

Gly Gln Leu Glu Glu Lys Ala Gln Glu Leu Gly His Ser Gln Ser Ala
1170 1175 1180

Leu Ala Ser Ala Gln Arg Glu Leu Ala Ala Phe Arg Thr Lys Val Gln
1185 1190 1195 1200

Asp His Ser Lys Ala Glu Asp Glu Trp Lys Ala Gln Val Ala Arg Gly
1205 1210 1215

Arg Gln Glu Ala Glu Arg Lys Asn Ser Leu Ile Ser Ser Leu Glu Glu
1220 1225 1230

Glu Val Ser Ile Leu Asn Arg Gln Val Leu Glu Lys Glu Gly Glu Ser
1235 1240 1245

Lys Glu Leu Lys Arg Leu Val Met Ala Glu Ser Glu Lys Ser Gln Lys
1250 1255 1260

Leu Glu Glu Ser Cys Ala Cys Cys Arg Gln Arg Gln Pro Ala Thr Val
1265 1270 1275 1280

Pro Glu Leu Gln Asn Ala Ala Leu Leu Cys Gly Arg Arg Cys Arg Ala
1285 1290 1295

Ser Gly Arg Glu Ala Glu Lys Gln Arg Val Ala Ser Glu Asn Leu Arg
1300 1305 1310

Gln Glu Leu Thr Ser Gln Ala Glu Arg Ala Glu Glu Leu Gly Gln Glu
1315 1320 1325

Leu Lys Ala Trp Gln Glu Lys Phe Phe Gln Lys Glu Gln Ala Leu Ser
1330 1335 1340

Thr Leu Gln Leu Glu His Thr Ser Thr Gln Ala Leu Val Ser Glu Leu
1345 1350 1355 1360

Leu Pro Ala Lys His Leu Cys Gln Gln Leu Gln Ala Glu Gln Ala Ala
1365 1370 1375

Ala Glu Lys Arg His Arg Glu Glu Leu Glu Gln Ser Lys Gln Ala Ala
1380 1385 1390

Gly Gly Leu Arg Ala Glu Leu Leu Arg Ala Gln Arg Glu Leu Gly Glu
1395 1400 1405

Leu Ile Pro Leu Arg Gln Lys Val Ala Glu Gln Glu Arg Thr Ala Gln
1410 1415 1420

Gln Leu Arg Ala Glu Lys Ala Ser Tyr Ala Glu Gln Leu Ser Met Leu
1425 1430 1435 1440

Lys Lys Ala His Gly Leu Leu Ala Glu Glu Asn Arg Gly Leu Gly Glu
1445 1450 1455

Arg Ala Asn Leu Gly Arg Gln Phe Leu Glu Val Glu Leu Asp Gln Ala
1460 1465 1470

Arg Glu Lys Tyr Val Gln Glu Leu Ala Ala Val Arg Ala Asp Ala Glu
1475 1480 1485

Thr Arg Leu Ala Glu Val Gln Arg Glu Ala Gln Ser Thr Ala Arg Glu
1490 1495 1500

Leu Glu Val Met Thr Ala Lys Tyr Glu Gly Ala Lys Val Lys Val Leu
1505 1510 1515 1520

Glu Glu Arg Gln Arg Phe Gln Glu Glu Arg Gln Lys Leu Thr Ala Gln
1525 1530 1535

Val Glu Glu Leu Ser Lys Lys Leu Ala Asp Ser Asp Gln Ala Ser Lys
1540 1545 1550

Val Gln Gln Gln Lys Leu Lys Ala Val Gln Ala Gln Gly Gly Glu Ser
1555 1560 1565

Gln Gln Glu Ala Gln Arg Phe Gln Ala Gln Leu Asn Glu Leu Gln Ala
1570 1575 1580

Gln Leu Ser Gln Lys Glu Gln Ala Ala Glu His Tyr Lys Leu Gln Met
1585 1590 1595 1600

Glu Lys Ala Lys Thr His Tyr Asp Ala Lys Lys Gln Gln Asn Gln Glu
1605 1610 1615

Leu Gln Glu Gln Leu Arg Ser Leu Glu Gln Leu Gln Lys Glu Asn Lys
1620 1625 1630

Glu Leu Arg Ala Glu Ala Glu Arg Leu Gly His Glu Leu Gln Gln Ala
1635 1640 1645

Gly Leu Lys Thr Lys Glu Ala Glu Gln Thr Cys Arg His Leu Thr Ala
1650 1655 1660

Gln Val Arg Ser Leu Glu Ala Gln Val Ala His Ala Asp Gln Gln Leu
1665 1670 1675 1680

Arg Asp Leu Gly Lys Phe Gln Val Ala Thr Asp Ala Leu Lys Ser Arg
1685 1690 1695

Glu Pro Gln Ala Lys Pro Gln Leu Asp Leu Ser Ile Asp Ser Leu Asp
1700 1705 1710

Leu Ser Cys Glu Glu Gly Thr Pro Leu Ser Ile Thr Ser Lys Leu Pro
1715 1720 1725

Arg Thr Gln Pro Asp Gly Thr Ser Val Pro Gly Glu Pro Ala Ser Pro
1730 1735 1740

Ile Ser Gln Arg Leu Pro Pro Lys Val Glu Ser Leu Glu Ser Leu Tyr
1745 1750 1755 1760

Phe Thr Pro Ile Pro Ala Arg Ser Gln Ala Pro Leu Glu Ser Ser Leu
1765 1770 1775

Asp Ser Leu Gly Asp Val Phe Leu Asp Ser Gly Arg Lys Thr Arg Ser
1780 1785 1790

Ala Arg Arg Arg Thr Thr Gln Ile Ile Asn Ile Thr Met Thr Lys Lys
1795 1800 1805

Leu Asp Val Glu Glu Pro Asp Ser Ala Asn Ser Ser Phe Tyr Ser Thr
1810 1815 1820

Arg Ser Ala Pro Ala Ser Gln Ala Ser Leu Arg Ala Thr Ser Ser Thr
1825 1830 1835 1840

Gln Ser Leu Ala Arg Leu Gly Ser Pro Asp Tyr Gly Asn Ser Ala Leu
1845 1850 1855

Leu Ser Leu Pro Gly Tyr Arg Pro Thr Thr Arg Ser Ser Ala Arg Arg
1860 1865 1870

Ser Gln Ala Gly Val Ser Ser Gly Ala Pro Pro Gly Arg Asn Ser Phe
1875 1880 1885

Tyr Met Gly Thr Cys Gln Asp Glu Pro Glu Gln Leu Asp Asp Trp Asn
1890 1895 1900

Arg Ile Ala Glu Leu Gln Gln Arg Asn Arg Val Cys Pro Pro His Leu
1905 1910 1915 1920

Lys Thr Cys Tyr Pro Leu Glu Ser Arg Pro Ser Leu Ser Leu Gly Thr
1925 1930 1935

Ile Thr Asp Glu Glu Met Lys Thr Gly Asp Pro Gln Glu Thr Leu Arg
1940 1945 1950

Arg Ala Ser Met Gln Pro Ile Gln Ile Ala Glu Gly Thr Gly Ile Thr
1955 1960 1965

Thr Arg Gln Gln Arg Lys Arg Val Ser Leu Glu Pro His Gln Gly Pro
1970 1975 1980

Gly Thr Pro Glu Ser Lys Lys Ala Thr Ser Cys Phe Pro Arg Pro Met
1985 1990 1995 2000

Thr Pro Arg Asp Arg His Glu Gly Arg Lys Gln Ser Thr Thr Glu Ala
2005 2010 2015

Gln Lys Lys Ala Ala Pro Ala Ser Thr Lys Gln Ala Asp Arg Arg Gln
2020 2025 2030

Ser Met Ala Phe Ser Ile Leu Asn Thr Pro Lys Lys Leu Gly Asn Ser
2035 2040 2045

Leu Leu Arg Arg Gly Ala Ser Lys Lys Ala Leu Ser Lys Ala Ser Pro
2050 2055 2060

Asn Thr Arg Ser Gly Thr Arg Arg Ser Pro Arg Ile Ala Thr Thr Thr
2065 2070 2075 2080

Ala Ser Ala Ala Thr Ala Ala Ala Ile Gly Ala Thr Pro Arg Ala Lys
2085 2090 2095

Gly Lys Ala Lys His
2100



WO 94/00573 PCT/US93/06160



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 353 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: mRNA 

(B) LOCATION: 1..353 

(D) OTHER INFORMATION: /note= "ANTI-SENSE SEQUENCE TO PART OF THE
MT1 MRNA TRANSCRIPT: N TERMINUS OF PROTEIN
CODING SEQUENCE AND UPSTREAM 53 NUCLEOTIDES."

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: complement (298..300)

(D) OTHER INFORMATION: /note= "MT1 INITIATION CODON SEQUENCE ON
COMPLEMENTARY STRAND." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CTCAATTTTA ACTTGTTCTT GTTTTTCTCG TTGTGCAAGG CGAGCTGCAA CTTCTTCAGG 60 
TGGTCGCTCC CTTATAGAAG ATGAGGATGC TTCTGAAAGT GCAGGTGTGG GTTTTCCTTC 120 
ACCAATTTCA GGGTGATCAG TTTTTAAAGA TTCCTCAGGC TGAACTGCAG GGGCTGGGAC 180 
CGACAGGGTA TCACCTGCTG CAGAAATAAT TTGAGCCGCT TCTGTAGGTG CTGTTGCTGA 240
AGCTGGAGTA TCTCCCTTTT GTTGTTGGAG TTGTGAGGCA GGCTGTTTAG ATTCTTTCAT 300 
TACTTCTGAT ACACTAGAGA TTTTTAGTGG ACCCGACTGA ATCGATTTCT TTG 353 
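
The feature location `complement (298..300)` can be checked against the sequence rows. The sketch below is illustrative only and not part of the original disclosure; the helper names are hypothetical, and the 60-nucleotide row (positions 241 to 300) is copied from the listing above with the spacing removed.

```python
# Sketch only: helper names are hypothetical; the row is copied from the listing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def slice_1based(fragment: str, fragment_start: int, first: int, last: int) -> str:
    """Extract positions first..last (1-based, inclusive) from a fragment whose
    first character sits at position fragment_start of the full sequence."""
    return fragment[first - fragment_start:last - fragment_start + 1]


# Positions 241..300 of SEQ ID NO: 5.
row_241_300 = (
    "AGCTGGAGTA" "TCTCCCTTTT" "GTTGTTGGAG"
    "TTGTGAGGCA" "GGCTGTTTAG" "ATTCTTTCAT"
)

antisense_codon = slice_1based(row_241_300, 241, 298, 300)
sense_codon = reverse_complement(antisense_codon)
print(antisense_codon, sense_codon)
```

Read on the complementary strand, the three nucleotides at 298..300 give the initiation codon named in the feature note.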
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..348

(D) OTHER INFORMATION: /note= "ANTISENSE SEQUENCE TO PART OF MT2
TRANSCRIPT: N TERMINUS OF PROTEIN CODING
REGION AND UPSTREAM 48 NUCLEOTIDES."

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: complement (298..300)

(D) OTHER INFORMATION: /note= "MT2 INITIATION CODON SEQUENCE ON
COMPLEMENTARY STRAND."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CATGGTCATC TTCGCCAGTT CCAGCTCTGA TCCCTCTAGC ACCTTCTGTG CAGATACCAG 60 

GCGTTCTGGG GAAGAGGGAT GTTTTCGATT TTTCTGCAGA AAACTGCACA CAAAGTCCAG 120 

TCTCTCTGAC ACCGGCTGCT TCTTGATTTG CTGTCCCTCT TCAGTGCCAT GGATTCTGTC 180 

AATGATCTTG ATGAAGATGC TGCAGTCCTG GAGCTGCAGC ACAGCCTCCA CAGGGTCAGC 240 

CACGTGTAGA CTGTTCACCC AAGAGAGGAG TGCAGCCCCC CGGGTGGCGT GGAGTGTCAT 300 



CTTGGTGATG CCAGACAGTC ACTCCAATGC GCCTGTAATC CCAGCTAC 348
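
Because SEQ ID NO: 6 is described as antisense to the start of the MT2 coding region, its reverse complement should translate to the beginning of SEQ ID NO: 4. The following sketch, which is not part of the original disclosure, checks this for positions 241 to 300 using the standard genetic code; the function and variable names are hypothetical.

```python
# Sketch only: helper names are hypothetical; the sequence is copied from the listing.
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of seq into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Positions 241..300 of SEQ ID NO: 6 (antisense strand).
antisense_241_300 = (
    "CACGTGTAGA" "CTGTTCACCC" "AAGAGAGGAG"
    "TGCAGCCCCC" "CGGGTGGCGT" "GGAGTGTCAT"
)

sense = reverse_complement(antisense_241_300)
peptide = translate(sense)
print(peptide)
```

The resulting 20 residues correspond to Met Thr Leu His Ala Thr Arg Gly Ala Ala Leu Leu Ser Trp Val Asn Ser Leu His Val, the first 20 residues listed for SEQ ID NO: 4.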

What is claimed is:



1. An isolated nucleic acid comprising the DNA
sequence of Seq. ID No. 1, including variants
thereof.

2. An isolated nucleic acid that hybridizes with the
DNA sequence of Seq. ID No. 1 under stringent
hybridization conditions.

3. A host cell transfected with the nucleic acid of
claim 1 or 2.

4. A vector comprising the nucleic acid of claim 1
or 2.

5. A protein or protein fragment encoded by the DNA
sequence of Seq. ID No. 1, including variants
thereof, in combination with an adjuvant.

6. A protein, produced by recombinant DNA in a host
cell and isolated from said host cell, said
recombinant DNA having the sequence of Seq. ID No.
1, including variants thereof.

7. A binding protein that binds to an epitope on the
protein of claim 6.

8. The binding protein of claim 7 wherein said
binding protein is an antibody or an antibody
fragment.

9. A method of manufacturing an antibody for use in
the detection of abnormal cell types, the method
comprising the steps of:

a) combining a recombinantly-produced protein or
protein fragment encoded by the DNA of Seq. ID
No. 1 or 3, including a variant thereof, with
an adjuvant to form a composition suitable for
injection into a mammal;

b) injecting the composition into a mammal
to induce antibody production in said
mammal against said recombinantly-produced
protein or protein fragment; and

c) isolating said antibody from said mammal.

10. The method of claim 9 wherein said step of
isolating said antibody from said mammal is
performed by isolating from said mammal a cell
producing said antibody.

11. A method of detecting an abnormal cell type in a
sample containing cells or cell nucleus debris,
the method comprising the steps of:

(a) contacting the sample with a binding
protein that recognizes an epitope on a marker
protein comprising an amino acid sequence
encoded by the DNA of Seq. ID No. 1 or 3 or a
variant thereof; and

(b) detecting the presence in the sample of
said marker protein or a fragment thereof.

12. The method of claim 9 or 11 wherein said abnormal
cell type is a malignant cell type.

13. The method of claim 12 wherein said malignant cell
type is characteristic of a malignant bladder,
breast, prostate, lung, colon, ovary or cervix
cell type.

14. The method of claim 11 wherein said binding
protein is an antibody that binds specifically to
an epitope on said marker protein or protein
fragment.

15. The method of claim 14 wherein said antibody has a
binding affinity for said epitope greater than
10^M'\

16. The method of claim 15 wherein said antibody has a
binding affinity greater than 10^M"\

17. The method of claim 11 comprising the additional
step of quantitating the abundance of said marker
protein in said sample.

18. The method of claim 11 wherein said sample
comprises a body fluid.






19. The method of claim 18 wherein said body fluid is
selected from the group consisting of serum,
plasma, blood, urine, semen, vaginal secretions,
spinal fluid, ascitic fluid, peritoneal fluid,
sputum, and breast exudate.

20. A method for determining the degree of cell death
in a tissue, the method comprising the steps of:

(a) contacting a sample with a binding protein
that recognizes an epitope on a marker protein for
cell death, said marker protein comprising an
amino acid sequence encoded by the DNA of Seq. ID
No. 1 or 3 or a variant thereof; and

(b) detecting the concentration of said marker
protein or protein fragment released from the
cells of said tissue, said marker protein or
protein fragment comprising an amino acid sequence
encoded by the DNA sequence of Seq. ID No. 1 or 3
or a variant thereof,

the concentration of said marker protein or
protein fragment detected being indicative of the
degree of cell death in said tissue.

21. The method of claim 20 comprising the additional
steps of:

c) repeating, at intervals, the steps of
detecting the concentration of said marker
protein or protein fragments thereof; and

d) comparing said detected concentrations,
wherein changes in said detected
concentrations are indicative of the status of
said tissue.

22. The method of claim 20 for use in monitoring
change in the status of a disease or the efficacy
of a therapy, wherein a decrease in said detected
concentrations is indicative of a decrease in cell
death, and an increase in said detected
concentrations is indicative of an increase in
cell death.

23. The method of claim 20 wherein said tissue is
characteristic of breast, prostate, lung, colon,
ovary, bladder or cervical tissue.

24. A method of detecting an abnormal cell type in a
sample containing cells or cell nucleus debris,
the method comprising the steps of:

a) contacting the sample with a nucleic acid
that hybridizes specifically to an mRNA
transcript encoded by the DNA sequence of Seq.
ID No. 1 or Seq. 3, said transcript, when
translated, encoding the amino acid sequence
of Seq. ID No. 1 or Seq. ID No. 3 or a variant
thereof; and

b) detecting the presence in the sample of
said mRNA transcript or a fragment or variant
thereof.

25. The method of claim 24 wherein said abnormal cell
type is a malignant cell type.

26. The method of claim 25 wherein said malignant cell
type is characteristic of a malignant breast,
prostate, lung, colon, cervix or bladder cell
type.

27. The method of claim 24 wherein said nucleic acid
hybridizes with said mRNA transcript under
stringent hybridization conditions.

28. The method of claim 24 comprising the additional
step of quantitating the abundance of said
transcript in said sample.

29. Use of a molecule capable of binding to the mRNA
transcript or protein product of MT1 or MT2,
including variants thereof, for the manufacture of
a cancer therapeutic agent.

30. Use according to claim 29 wherein said cancer
therapeutic agent is for the treatment of breast,
prostate, cervix, ovarian, bladder, colon,
prostate or lung cancer.

31. Use according to claim 30 wherein said molecule is
an oligonucleotide complementary to at least a
portion of the DNA sequence of Seq. ID No. 1 or 3.

32. Use according to claim 31 wherein said
oligonucleotide is a synthetic oligonucleotide and
comprises at least a portion of the sequence of
Seq. ID No. 5 or 6.

33. Use according to claim 29 wherein said molecule is
a member of a binding pair capable of binding MT1
or MT2 or a variant thereof substantially
irreversibly.

34. Use according to claim 33 wherein said member of
said binding pair binds MT1 or MT2 or a variant
thereof with an affinity greater than about lo'
M-^

35. A synthetic oligonucleotide in admixture with a
pharmaceutical carrier for use in the manufacture
of a therapeutic agent, said synthetic
oligonucleotide comprising a sequence
complementary to at least a portion of the mRNA
transcript of MT1 or MT2 or a variant thereof.

36. The synthetic oligonucleotide of claim 35
comprising a sequence complementary to at least a
portion of the DNA sequence of Seq. ID No. 1 or 3
or a variant thereof.

37. The synthetic oligonucleotide of claim 35
comprising at least a portion of the sequence of
Seq. ID No. 5 or 6 or a variant thereof.

38. The synthetic oligonucleotide of claim 35 being at
least 15 nucleotides in length.

39. A binding protein for use in the manufacture of a
medicament, said binding protein having a binding
affinity of greater than about lO^M*^ for the
protein encoded by the DNA of Seq. ID No. 1 or 3,
or a variant thereof.

FIG. 4

♦ Assay 1 is 107.7 solid phase and 307.33 soluble phase.

Assay 2 is 107.7 solid phase and 302.29 soluble phase.

Assay 3 is 302.18 solid phase and 302.22 soluble phase.
§ NT means not tested.